1 MGALARALLLPLLAQWLLRA RPRDPEVVNDESSLVRHRWK 518



Scoring table: BLOSUM62 

Gapop 10.0, Gapext 0.5
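The search parameters above (BLOSUM62, Gapop 10.0, Gapext 0.5) describe an affine gap model. The report does not say which affine convention the search program uses, so the sketch below is only an assumption: by default the opening penalty covers the first gapped position and each further position costs the extension penalty. The function name `gap_penalty` and the flag `open_covers_first` are hypothetical, chosen for illustration.

```python
def gap_penalty(length, gap_open=10.0, gap_ext=0.5, open_covers_first=True):
    """Return the affine penalty for a gap spanning `length` positions.

    Assumption (not stated in the report): with open_covers_first=True the
    opening penalty already pays for the first gapped position, so a gap of
    n positions costs gap_open + (n - 1) * gap_ext. With False, every
    position also pays the extension penalty: gap_open + n * gap_ext.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0  # no gap, no penalty
    if open_covers_first:
        return gap_open + gap_ext * (length - 1)
    return gap_open + gap_ext * length
```

Since every alignment reported here has Indels 0 and Gaps 0, the gap penalty never contributes to the scores shown; the sketch only illustrates how the two parameters would combine if a gap occurred.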
ALIGNMENTS



RESULT 1 
AAW61362 

ID AAW61362 standard; protein; 518 AA. 
XX 

AC AAW61362; 
XX 

DT 25-MAR-2003 (revised) 

DT 25-SEP-1998 (first entry) 

XX 

DE Aspartic proteinase ASP1. 
XX 

KW ASP1; Aspartic proteinase; Alzheimer's disease; cancer; melanoma. 
XX 

OS Homo sapiens.
XX 

PN EP848062-A2. 



XX 

PD 17-JUN-1998. 
XX 

PF 01-DEC-1997; 97EP-00309648.
XX
PR 14-DEC-1996; 96GB-00026022.
PR 06-OCT-1997; 97US-00999723.
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 

PI Powell DJ, Southan C, Chapman CG, Evans JR; 
XX 

DR WPI; 1998-314477/28. 

DR N-PSDB; AAV27962. 
XX 

PT New isolated polynucleotide encodes Aspartic protease polypeptide - used
PT to diagnose, treat and vaccinate against Alzheimer's disease, cancer and
PT melanoma.
XX 

PS Claim 11; Page 7; 19pp; English. 
XX 

CC The human ASP1 protein is structurally related to other proteins of the
CC Aspartic proteinase family. ASP1 polypeptides and polynucleotides can be
CC used to diagnose, treat and vaccinate against Alzheimer's disease,
CC cancer and melanoma. (Updated on 25-MAR-2003 to correct PR field.)
XX 

SQ Sequence 518 AA; 



Query Match 100.0%; Score 2687; DB 2; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
        ||||||||||||||||||||||||||||||||||||||
Db  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
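The summary lines above (Matches, Mismatches, Indels, Best Local Similarity) are derived from the aligned Qy and Db rows. As a rough sketch only, the counts could be recomputed from a pair of aligned strings as shown below. The function name `alignment_stats`, the `-` gap character, and the reading of similarity as matches divided by aligned columns are assumptions; the report does not define its formulas. Conservative substitutions need the scoring matrix and are counted here simply as mismatches.

```python
def alignment_stats(qy, db, gap_char="-"):
    """Count matches, mismatches and gapped columns in an aligned pair.

    Both strings must have the same length (one character per column).
    Identity is reported as matches / aligned columns * 100, which is an
    assumed definition, not one taken from the report.
    """
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gap_columns = 0
    for q, d in zip(qy, db):
        if q == gap_char or d == gap_char:
            gap_columns += 1      # column with a gap in either row
        elif q == d:
            matches += 1          # identical residues
        else:
            mismatches += 1       # any substitution, conservative or not
    columns = matches + mismatches + gap_columns
    identity = 100.0 * matches / columns if columns else 0.0
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gap_columns": gap_columns,
        "identity": identity,
    }


# Usage with the first ten residues of the query aligned to itself.
stats = alignment_stats("MGALARALLL", "MGALARALLL")
```

For a full-length self-match such as RESULT 1, every column is identical, which is consistent with the reported Matches 518 and 100.0% similarity.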



RESULT 2 
AAY13799 

ID AAY13799 standard; protein; 518 AA. 
XX 

AC AAY13799; 
XX 

DT 21-SEP-1999 (first entry) 
XX 

DE Human aspartyl protease, CSP56. 
XX 

KW CSP56; human; aspartyl protease; diagnosis; neoplasia; tumour; 

KW breast tumour; colon tumour. 

XX 

OS Homo sapiens. 
XX 

PN WO9933963-A1.
XX 

PD 08-JUL-1999. 
XX 

PF 14-DEC-1998; 98WO-US026547.
XX
PR 31-DEC-1997; 97US-0070112P.
XX 

PA (CHIR ) CHIRON CORP. 
XX 

PI Giese KW, Xin H; 
XX 

DR WPI; 1999-430240/36. 

DR N-PSDB; AAX89297. 
XX 

PT Human CSP56 protein for diagnosis of neoplasia. 
XX 

PS Claim 2; Fig 2A; 51pp; English. 
XX 

CC This represents a human CSP56 protein, a novel aspartyl protease. The 

CC CSP56 protein can be used in methods for diagnosing neoplasia, for 

CC determining the metastatic potential of a tumour, and for screening test 

CC compounds for the ability to suppress the metastatic potential of a 

CC tumour. The tumours are preferably from breast or colon 

XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 2; Length 518; 
Best Local Similarity 100.0%; Pred. No. 8.6e-231; 



Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
        ||||||||||||||||||||||||||||||||||||||
Db  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 3 
AAY22239 

ID AAY22239 standard; protein; 518 AA. 
XX 

AC AAY22239; 
XX 

DT 20-SEP-1999 (first entry) 
XX 

DE Human CSP56, aspartyl-type protease, protein sequence. 
XX 

KW Metastatic marker protein; human; cancer metastasis; breast cancer; 

KW colon cancer; diagnosis; therapy; tumour; metastatic potential; CSP56; 

KW aspartyl-type protease. 
XX 

OS Homo sapiens. 
XX 

PN WO9934004-A2. 
XX 



PD 08-JUL-1999. 
XX 

PF 24-DEC-1998; 98WO-US027608.
XX
PR 31-DEC-1997; 97US-0070112P.
XX 

PA (CHIR ) CHIRON CORP. 
XX 

PI Xin H, Giese K; 
XX 

DR WPI; 1999-430248/36. 

DR N-PSDB; AAX84708. 
XX 

PT New polynucleotides associated with cancer metastasis. 
XX 

PS Claim 4; Page 78-80; 80pp; English. 
XX 

CC This sequence represents a polypeptide of the invention, and is an 

CC aspartyl-type protease, designated CSP56. The polynucleotides (PNs) of 

CC the invention encode metastatic marker protein variants. The PNs and 

CC polypeptides can be used as markers for cancer metastasis. The products 

CC can be used for identifying metastatic tissue or metastatic potential of 

CC a tissue, e.g. breast or colon tissue. They can also be used for 

CC screening test compounds for the ability to suppress the metastatic 

CC potential of a tumour. The products can be used for developing products 

CC for the therapy of cancers, particularly breast or colon cancer 

XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 2; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
        ||||||||||||||||||||||||||||||||||||||
Db  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 4 
AAY41714 





ID AAY41714 standard; protein; 518 AA.
XX
AC AAY41714;
XX
DT 07-DEC-1999 (first entry)
XX
DE Human PRO852 protein sequence.
XX
KW Human; PRO; EST; expressed sequence tag; PCR primer; hybridisation;
KW probe; blood coagulation disorder; cancer; cellular adhesion disorder;
KW secreted protein; transmembrane protein.
XX
OS Homo sapiens.
XX
PN WO9946281-A2.
XX
PD 16-SEP-1999.
XX
PF 08-MAR-1999; 99WO-US005028.
XX
PR 10-MAR-1998; 98US-0077450P.
PR 11-MAR-1998; 98US-0077632P.
PR 11-MAR-1998; 98US-0077641P.
PR 11-MAR-1998; 98US-0077649P.
PR 12-MAR-1998; 98US-0077791P.
PR 13-MAR-1998; 98US-0078004P.
PR 17-MAR-1998; 98US-00040220.
PR 20-MAR-1998; 98US-0078886P.
PR 20-MAR-1998; 98US-0078910P.
PR 20-MAR-1998; 98US-0078936P.
PR 20-MAR-1998; 98US-0078939P.
PR 25-MAR-1998; 98US-0079294P.
PR 26-MAR-1998; 98US-0079656P.
PR 27-MAR-1998; 98US-0079663P.
PR 27-MAR-1998; 98US-0079664P.
PR 27-MAR-1998; 98US-0079689P.
PR 27-MAR-1998; 98US-0079728P.
PR 27-MAR-1998; 98US-0079786P.
PR 30-MAR-1998; 98US-0079920P.
PR 30-MAR-1998; 98US-0079923P.
PR 31-MAR-1998; 98US-0080105P.
PR 31-MAR-1998; 98US-0080107P.
PR 31-MAR-1998; 98US-0080165P.
PR 31-MAR-1998; 98US-0080194P.
PR 01-APR-1998; 98US-0080327P.
PR 01-APR-1998; 98US-0080328P.
PR 01-APR-1998; 98US-0080333P.
PR 01-APR-1998; 98US-0080334P.
PR 08-APR-1998; 98US-0081049P.
PR 08-APR-1998; 98US-0081070P.
PR 08-APR-1998; 98US-0081071P.
PR 09-APR-1998; 98US-0081195P.
PR 09-APR-1998; 98US-0081203P.
PR 09-APR-1998; 98US-0081229P.
PR 15-APR-1998; 98US-0081817P.
PR 15-APR-1998; 98US-0081838P.
PR 15-APR-1998; 98US-0081952P.
PR 15-APR-1998; 98US-0081955P.
PR 21-APR-1998; 98US-0082568P.
PR 21-APR-1998; 98US-0082569P.
PR 22-APR-1998; 98US-0082700P.
PR 22-APR-1998; 98US-0082704P.
PR 22-APR-1998; 98US-0082804P.
PR 23-APR-1998; 98US-0082767P.
PR 23-APR-1998; 98US-0082796P.
PR 27-APR-1998; 98US-0083336P.
PR 28-APR-1998; 98US-0083322P.
PR 29-APR-1998; 98US-0083392P.
PR 29-APR-1998; 98US-0083495P.
PR 29-APR-1998; 98US-0083496P.
PR 29-APR-1998; 98US-0083499P.
PR 29-APR-1998; 98US-0083500P.
PR 29-APR-1998; 98US-0083545P.
PR 29-APR-1998; 98US-0083554P.
PR 29-APR-1998; 98US-0083558P.
PR 29-APR-1998; 98US-0083559P.
PR 30-APR-1998; 98US-0083742P.
PR 05-MAY-1998; 98US-0084366P.
PR 06-MAY-1998; 98US-0084414P.
PR 06-MAY-1998; 98US-0084441P.
PR 07-MAY-1998; 98US-0084598P.
PR 07-MAY-1998; 98US-0084600P.
PR 07-MAY-1998; 98US-0084627P.
PR 07-MAY-1998; 98US-0084637P.
PR 07-MAY-1998; 98US-0084639P.
PR 07-MAY-1998; 98US-0084640P.
PR 07-MAY-1998; 98US-0084643P.
PR 13-MAY-1998; 98US-0085323P.
PR 13-MAY-1998; 98US-0085338P.
PR 13-MAY-1998; 98US-0085339P.
PR 15-MAY-1998; 98US-0085573P.
PR 15-MAY-1998; 98US-0085579P.
PR 15-MAY-1998; 98US-0085580P.
PR 15-MAY-1998; 98US-0085582P.
PR 15-MAY-1998; 98US-0085689P.
PR 15-MAY-1998; 98US-0085697P.
PR 15-MAY-1998; 98US-0085700P.
PR 15-MAY-1998; 98US-0085704P.
PR 18-MAY-1998; 98US-0086023P.



PR 22-MAY-1998; 98US-0086392P.
PR 22-MAY-1998; 98US-0086414P.
PR 22-MAY-1998; 98US-0086430P.
PR 22-MAY-1998; 98US-0086486P.
PR 28-MAY-1998; 98US-0087098P.
PR 28-MAY-1998; 98US-0087106P.
PR 28-MAY-1998; 98US-0087208P.
PR 30-JUL-1998; 98US-0094651P.
PR 11-SEP-1998; 98US-0100038P.
XX
PA (GETH ) GENENTECH INC.
XX
PI Wood WI, Goddard A, Gurney A, Yuan J, Baker KP, Chen J;
XX
DR WPI; 1999-551358/46.
DR N-PSDB; AAZ34056.
XX
PT New secreted and transmembrane polypeptides and their polynucleotides,
PT useful for treating blood coagulation disorders, cancers and cellular
PT adhesion disorders.
XX
PS Claim 12; Fig 73; 530pp; English.
XX
CC The present invention describes secreted and transmembrane polypeptides
CC and their polynucleotides. The nucleotide sequences are useful as sources
CC of probes, primers, for chromosome mapping, and for generation of
CC antisense sequences. They can also be used to create transgenic animals.
CC The proteins can be used to treat a variety of diseases and disorders,
CC depending on their function. Diseases that may be treated include blood
CC coagulation disorders, cancers and cellular adhesion disorders. They may
CC also be used to raise antibodies. AAZ33891 to AAZ34338, and AAY41685 to
CC AAY41774 represent polynucleotide and polypeptide sequence given in the
CC exemplification of the present invention
XX
SQ Sequence 518 AA;



Query Match 100.0%; Score 2687; DB 2; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
        ||||||||||||||||||||||||||||||||||||||
Db  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 5 


AAY88424 


ID AAY88424 standard; protein; 518 AA.
XX
AC AAY88424;
XX
DT 03-AUG-2000 (first entry)
XX
DE Human aspartyl protease 1 (Asp1) amino acid sequence.
XX
KW Aspartyl protease; aspartase; amyloid precursor protein; APP; Asp1;
KW Alzheimer's disease; beta secretase site.
XX
OS Homo sapiens.
XX
PN WO200017369-A2.
XX
PD 30-MAR-2000.
XX
PF 23-SEP-1999; 99WO-US020881.
XX
PR 24-SEP-1998; 98US-0101594P.
XX
PA (PHAA ) PHARMACIA & UPJOHN CO.
XX
PI Gurney ME, Bienkowski MJ, Heinrikson RL, Parodi LA, Yan R;
XX
DR WPI; 2000-303209/26.
DR N-PSDB; AAA15661.
XX
PT New enzyme designated human aspartase useful in research into Alzheimer's
PT Disease is capable of cleaving amyloid protein precursor at the beta
PT secretase site to produce amyloid beta peptide.
XX
PS Claim 54; Fig 1; 183pp; English.
XX
CC This sequence represents the human aspartyl protease amino acid sequence.



CC The invention relates to a protease capable of cleaving the beta
CC secretase site of amyloid precursor protein (APP). The protease contains
CC a sequence encoding the amino acid sequence DTG and a sequence encoding
CC DSG or DTG separated by 100-300 amino acids. When mutated the APP gene
CC causes an autosomal dominant form of Alzheimer's disease. APP localises
CC to the cell surface membrane and has a single C-terminal transmembrane
CC domain. Proteolytic processing of APP produces the amyloid beta protein,
CC which is possibly very important in Alzheimer's disease. The invention
CC includes a nucleotide sequence encoding the protease, a vector containing
CC the nucleotide sequence, and a cell line comprising the vector. Methods
CC for screening for inhibitors of beta secretase activity are also given in
CC the invention. The human aspartase protein and nucleotide sequences and
CC the methods for identifying inhibitors of the protease, are useful in the
CC treatment of and research into Alzheimer's disease
XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 3; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
        ||||||||||||||||||||||||||||||||||||||
Db  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
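The RESULT 5 annotation characterises the protease by a DTG motif and a second DSG or DTG motif separated by 100-300 amino acids. A minimal sketch of how such a motif pair might be located in a protein sequence is given below. The function name `find_motif_pairs` is hypothetical, and measuring the separation as the number of residues between the end of the first motif and the start of the second is an assumption, since the annotation does not say how the distance is counted.

```python
import re


def find_motif_pairs(seq, min_sep=100, max_sep=300):
    """Return 1-based start positions of (DTG, D[ST]G) motif pairs.

    A pair qualifies when the number of residues lying between the end of
    the DTG motif and the start of the following DSG/DTG motif falls within
    [min_sep, max_sep]. This distance convention is an assumption.
    """
    # Zero-width lookaheads so that overlapping occurrences are all found.
    first_sites = [m.start() for m in re.finditer(r"(?=DTG)", seq)]
    second_sites = [m.start() for m in re.finditer(r"(?=D[ST]G)", seq)]
    pairs = []
    for i in first_sites:
        for j in second_sites:
            separation = j - (i + 3)  # residues between the two motifs
            if min_sep <= separation <= max_sep:
                pairs.append((i + 1, j + 1))
    return pairs


# Toy sequence (not from the report): DTG, 150 filler residues, then DSG.
toy = "AAADTG" + "A" * 150 + "DSGAAA"
pairs = find_motif_pairs(toy)
```

The same function could be run on the 518-residue query once its rows are concatenated into a single string.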



RESULT 6 
AAB44270 
ID AAB44270 standard; protein; 518 AA.
XX
AC AAB44270;
XX
DT 08-FEB-2001 (first entry)
XX
DE Human PRO852 (UNQ418) protein sequence SEQ ID NO: 196.
XX
KW Human; secreted protein; transmembrane protein; PRO; EST; cytostatic;
KW expressed sequence tag; detection; cancer.
XX
OS Homo sapiens.
XX
PN WO200053756-A2.
XX
PD 14-SEP-2000.
XX
PF 18-FEB-2000; 2000WO-US004341.
XX
PR 08-MAR-1999; 99WO-US005028.
PR 12-MAR-1999; 99US-0123957P.
PR 29-MAR-1999; 99US-0126773P.
PR 21-APR-1999; 99US-0130232P.
PR 28-APR-1999; 99US-0131445P.
PR 14-MAY-1999; 99US-0134287P.
PR 23-JUN-1999; 99US-0141037P.
PR 26-JUL-1999; 99US-0145698P.
PR 29-OCT-1999; 99US-0162506P.
PR 30-NOV-1999; 99WO-US028313.
PR 02-DEC-1999; 99WO-US028551.
PR 02-DEC-1999; 99WO-US028565.
PR 16-DEC-1999; 99WO-US030095.
PR 30-DEC-1999; 99WO-US031243.
PR 30-DEC-1999; 99WO-US031274.
PR 05-JAN-2000; 2000WO-US000219.
PR 06-JAN-2000; 2000WO-US000277.
PR 06-JAN-2000; 2000WO-US000376.
XX
PA (GETH ) GENENTECH INC.
XX
PI Ashkenazi AJ, Baker KP, Botstein D, Desnoyers L, Eaton DL;
PI Ferrara N, Filvaroff E, Fong S, Gao W, Gerber H, Gerritsen ME;
PI Goddard A, Godowski PJ, Grimaldi CJ, Gurney AL, Hillan KJ;
PI Kljavin IJ, Kuo SS, Napier MA, Pan J, Paoni NF, Roy MA, Shelton DL;
PI Stewart TA, Tumas D, Williams PM, Wood WI;
XX
DR WPI; 2000-611443/58.
DR N-PSDB; AAC78500.
XX
PT Novel PRO polypeptides and polynucleotides used in detection methods, to
PT target bioactive molecules to specific cells, and to modulate cellular
PT activities.
XX
PS Claim 12; Fig 73; 636pp; English.



XX 

CC AAC78458 to AAC78599 represent polynucleotide and EST (expressed sequence 

CC tag) sequences which encode secreted or transmembrane PRO polypeptides. 

CC The PRO polynucleotides and polypeptides have cytostatic activity. The 

CC polynucleotides and polypeptides can be used for detecting the presence 

CC of PRO polypeptides in samples, for linking bioactive molecules to cells 

CC and for modulating biological activities of cells, using the polypeptides 

CC for specific targeting. The polypeptide targeting can be used to kill the 

CC target cells, e.g. for the treatment of cancers. The polypeptide pairs 

CC provide specific targeting of bioactive molecules to cells. AAC78600 to 

CC AAC78987 represent PCR primers and probes used in the isolation of the 

CC PRO polynucleotide sequences 
XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 3; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
        ||||||||||||||||||||||||||||||||||||||
Db  481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 7 



AAU07201 

ID AAU07201 standard; protein; 518 AA. 
XX 

AC AAU07201; 
XX 

DT 24-OCT-2001 (first entry)
XX 

DE Human aspartyl protease 1 (Asp-1).
XX 

KW Human; aspartyl protease 1; Asp-1; nootropic; neuroprotective; 

KW aspartyl protease 2; Asp2; amyloid protein precursor; APP;

KW beta-secretase; Alzheimer's disease. 
XX 

OS Homo sapiens. 
XX 

PN WO200149097-A2. 
XX 

PD 12-JUL-2001. 
XX 

PF 09-MAY-2001; 2001WO-IB000797.
XX
PR 09-MAY-2001; 2001WO-IB000797.
XX 

PA (BIEN/) BIENKOWSKI M J. 

PA (GURN/) GURNEY M E. 

PA (HEIN/) HEINRIKSON R L. 

PA (PARO/) PARODI L A. 

PA (YANR/) YAN R. 

XX 

PI Bienkowski MJ, Gurney ME, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2001-502548/55. 

DR N-PSDB; AAS11701. 
XX 

PT Novel purified polypeptide comprising fragment of mammalian aspartyl 

PT protease 2, lacking Asp2 transmembrane domain and retaining beta 

PT secretase activity of Asp2 useful for identifying inhibitors of Asp2 

PT activity. 

XX 

PS Example 2; Fig 1; 185pp; English. 
XX 

CC The invention relates to a novel purified polypeptide comprising a 

CC fragment of mammalian aspartyl protease 2 (Asp2) protein which lacks the 

CC Asp2 transmembrane domain and the Asp2 protein, and where the polypeptide 

CC and the fragment retain the beta-secretase activity of the mammalian Asp2 

CC protein. Also included is an isoform of amyloid protein precursor (APP) 

CC comprising the amino acid sequence of a APP or its fragment containing an 

CC APP cleavage site recognisable by a mammalian beta-secretase, and further

CC comprising two lysine residues at the carboxyl terminus of the amino acid 

CC sequence of the mammalian APP or APP fragment. The polypeptides are used 

CC for assaying for modulators of beta-secretase activity; identifying 

CC agents that inhibit the APP processing activity of human Asp2 aspartyl 

CC protease (Hu-Asp2); identifying agents that modulate the activity of Asp2;

CC and for reducing cellular production of amyloid beta (Abeta) from APP.

CC Agents identified by the above methods are useful for treating 

CC Alzheimer's disease; and for identifying modulators of amyloid-beta 

CC (Abeta) peptide production, for use in designing therapeutics for the 



CC treatment or prevention of Alzheimer's disease. Probes and primers 

CC derived from Asp nucleic acid sequences are useful for detecting Hu-Asp 

CC nucleic acids in in vitro assays and in Northern and Southern blots. The 

CC present sequence represents the amino acid sequence of human Asp-1 

XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MGALARAL L L P L LAQWLLRAAP E LAP AP FT L P L RVAAATN RWAP T P G P GT P AERHAD GL 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60 

Qy 61 ALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

I II I! II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I II II I I I I I II 
Db 181 FE S EN FFL PGI KWNGI LGLAYAT LAKP S S S LET FFD S LVTQAN I PNVFSMQMCGAGLPVA 240 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

I I M I I I I I I I I I I M I I I I I I II I I I II I I I I I I I II M I I I I I I I I I II I I I I II I I I 
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420 

I I I I M II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II II I I I I I I I I I I I 
Db 361 YLRDEN S S RS FRIT I LPQL YI Q PMMGAGLN YEC YRFGI S P STNALVI GATVMEGFYVI FD 420 

Qy 421 RAQ KRVG FAAS PCAE I AGAAVS EISGPFST EDVAS NCVP AQ S L S E P I LWI VS YALMS VCG 480 

I I M II I I II I I I I I I I II II I I I I I I II I I I I I I I I I I II I I I I I I II I I I I I I I I I I I 
Db 421 RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE DVAS N CVP AQ S L S E P I LWI VS YALMS VCG 480 

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 

I I I I I I I I I I II I I I I I I I I II I I I I I II I I I I I I I I I 

Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 



RESULT 8 
AAE10628 

ID AAE10628 standard; protein; 518 AA. 
XX 

AC AAE10628; 
XX 

DT 10-DEC-2001 (first entry) 
XX 

DE Human aspartyl protease 1 (hu-Asp1) protein.



XX 

KW Human; aspartyl protease 1; Asp1; amyloid precursor protein; APP;

KW Alzheimer's disease; AD; dementia; neurofibrillary tangle; gliosis; 

KW amyloid plaque; neuronal loss; proteolytic; nootropic; neuroprotective; 

KW chromosome 21. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers

FT Peptide 1..20

FT /label= Signal_peptide

FT Protein 21..518

FT /note= "Mature human aspartyl protease 1"

FT Domain 469..492

FT /label= Transmembrane_domain

XX 

PN GB2357767-A. 
XX 

PD 04-JUL-2001. 
XX 

PF 22-SEP-2000; 2000GB-00023315 . 
XX 

PR 23-SEP-1999; 99US-00404133 . 

PR 23-SEP-1999; 99US-0155493P . 

PR 23-SEP-1999; 99WO-US020881 . 

PR 13-OCT-1999; 99US-00416901.

PR 06-DEC-1999; 99US-0169232P . 
XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Bienkowkski MJ, Gurney M; 
XX 

DR WPI; 2001-444208/48. 

DR N-PSDB; AAD17864. 
XX 

PT Polypeptide comprising fragments of human aspartyl protease with amyloid 

PT precursor protein processing activity and alpha-secretase activity, for 

PT identifying modulators useful in treating Alzheimer's disease. 
XX 

PS Claim 36; Fig 1; 187pp; English. 
XX 

CC The patent discloses human aspartyl protease 1 (hu-Asp1) or modified Asp1

CC proteins which lack transmembrane domain or amino terminal domain or

CC cytoplasmic domain and retains alpha-secretase activity and amyloid

CC protein precursor (APP) processing activity. The proteins of the

CC invention are useful for assaying hu-Asp1 alpha-secretase activity, which

CC in turn is useful for identifying modulators of hu-Asp1 alpha-secretase

CC activity, where modulators that increase hu-Asp1 alpha-secretase activity

CC are useful for treating Alzheimer's disease (AD) which causes progressive

CC dementia with consequent formation of amyloid plaques, neurofibrillary

CC tangles, gliosis and neuronal loss. Hu-Asp1 protease substrate is useful

CC for assaying hu-Asp1 proteolytic activity, by contacting hu-Asp1 protein

CC with the substrate under acidic conditions and determining the level of

CC hu-Asp1 proteolytic activity. The present sequence is Asp1 protein from

CC human. Asp1 gene is localised on chromosome 21
XX 

SQ Sequence 518 AA; 



Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

I I I I I I I I I I M I I I I I I ! I I I I I I I I I I I I I M I M I I I N I I I I I I I I I I I I I I I I I I 

Db 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy 61 ALALE P ALAS P AGAAN FLAMVDN LQ GD S G RG Y YL EML I GT P P Q KLQ I LVDT G S S N FAVAG 120 

| M I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M II I I I I I I I I 
Db 61 ALALE PALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120 

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

| | I I I I I I I I I I I I I I I I I M I I I I I I I I I I M I I II I I I I I I I I I I 

Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

| M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I M I I II I I I I I I I II I I I 
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

| | | M I I I I I I I M I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

| | | | | | | | | I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

| M | | | I M I I I I I I I I I I I I I I M I I I I I I I I I M I I I I II I I I I I I I I I I I I I I I I I I 
Db 361 YLRDENS SRS FRITI LPQLYI QPMMGAGLNYECYRFGI SPSTNALVI GATVMEGFYVI FD 420 

Qy 421 RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE DVAS NCVP AQ S L S E P I LW I VS YALMS VC G 480 

|| | | | | | I || I I I I I I I I I I I I I I I I I I I II I I I M I M I I I M I I I I I I II I I I I I I I I 
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480 


Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 481 AI LLVLI VLLLLP FRCQRRPRDPEWNDESSLVRHRWK 518 



RESULT 9 
AAE10656 

ID AAE10656 standard; protein; 518 AA. 
XX 

AC AAE10656; 
XX 

DT 10-DEC-2001 (first entry) 
XX 

DE Human-Asp 1 protein lacking TM domain and containing (His)6 tag.
XX 

KW Human; aspartyl protease 1; Asp1; amyloid precursor protein; APP;

KW Alzheimer's disease; AD; dementia; neurofibrillary tangle; gliosis; 

KW amyloid plaque; neuronal loss; proteolytic; nootropic; neuroprotective. 

XX 

OS Homo sapiens. 



OS Synthetic. 
XX 

PN GB2357767-A. 
XX 

PD 04-JUL-2001. 
XX 

PF 22-SEP-2000; 2000GB-00023315.
XX 

PR 23-SEP-1999; 99US-00404133 . 

PR 23-SEP-1999; 99US-0155493P. 

PR 23-SEP-1999; 99WO-US020881 . 

PR 13-OCT-1999; 99US-00416901 . 

PR 06-DEC-1999; 99US-0169232P . 
XX 

PA (PHAA ) PHARMACIA & UPJOHN CO. 
XX 

PI Bienkowkski MJ, Gurney M; 
XX 

DR WPI; 2001-444208/48. 
XX 

PT Polypeptide comprising fragments of human aspartyl protease with amyloid 

PT precursor protein processing activity and alpha-secretase activity, for 

PT identifying modulators useful in treating Alzheimer's disease. 
XX 

PS Example 14; Page 155-156; 187pp; English. 
XX 

CC The patent discloses human aspartyl protease 1 (hu-Asp1) or modified Asp1

CC proteins which lack transmembrane domain or amino terminal domain or

CC cytoplasmic domain and retains alpha-secretase activity and amyloid

CC protein precursor (APP) processing activity. The proteins of the

CC invention are useful for assaying hu-Asp1 alpha-secretase activity, which

CC in turn is useful for identifying modulators of hu-Asp1 alpha-secretase

CC activity, where modulators that increase hu-Asp1 alpha-secretase activity

CC are useful for treating Alzheimer's disease (AD) which causes progressive

CC dementia with consequent formation of amyloid plaques, neurofibrillary

CC tangles, gliosis and neuronal loss. Hu-Asp1 protease substrate is useful

CC for assaying hu-Asp1 proteolytic activity, by contacting hu-Asp1 protein

CC with the substrate under acidic conditions and determining the level of

CC hu-Asp1 proteolytic activity. The present sequence is human Asp 1 protein

CC lacking a transmembrane (TM) domain and containing (His)6 tag. This

CC sequence is generated from human Asp 1 protein by the deletion of its C-

CC terminal TM domain and addition of hexa-histidine tag at its C-terminus
XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 4; Length 518; 
Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MGALARAL LLP LLAQWLL RAAP E LAP AP FT L P L RVAAATN RWAPT P G P GT P AE RHAD G L 60 

I I M I I I I II I I I I I I I II II M I I I I I I I I I I I > I M I I I II I I I I I I I I I I I I I I I M 

Db 1 MGALARAL LL P LLAQWLL RAAP E LAP AP FT L P LRVAAATN RWAPT P G P GT P AE RHADG L 60 

Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

I I I I I I I II I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I M I I I I I 

Db 61 ALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120 



Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

I M | | | | | | | | | I I I I I I I I I I I I I I I I I I I II I II I I I I II I I I I M I I I I I I I I I M I 

Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

|| | | | | || | || I I I I I I I M I I I I I I I I II I I I I I I I I I I I I M I I I I I I M I I I I I I I I 
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

|| | | || | | | I I I I I I I I II I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

I I I || I I II I I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I I I M I I I I I I I I I I I I 
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420 

Qy 421 RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE DVAS N C VP AQ S L S E P I LW I VS YALMS VCG 480 

I | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II 
Db 421 RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE D VASNCVP AQ S L S E P I LW I VS YALMS VCG 480 

Qy 481 AI LLVLI VLLLLPFRCQRRPRDPEWNDES S LVRHRWK 518 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 481 AILLVL I VLLLLPFRCQRRPRDPEWNDESS LVRHRWK 518 



RESULT 10 
AAE06858 

ID AAE06858 standard; protein; 518 AA. 
XX 
AC AAE06858;
XX
DT 23-OCT-2001 (first entry)
XX
DE Human aspartyl protease 1 (Hu-Asp1) protein.
XX
KW Human; aspartyl protease 1; Asp 1; beta-amyloid precursor protein; APP;
KW beta-secretase; Alzheimer's disease; dementia; amyloid plaque; gliosis;
KW neurofibrillary tangle; neuronal loss; amyloid-beta peptide; nootropic;
KW neuroprotective; antisense therapy; gene therapy; chromosome 21.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..20
FT /label= Signal_peptide
FT Protein 21..518
FT /note= "Mature human aspartyl protease 1 (Hu-Asp1)"
FT Domain 469..492
FT /label= Transmembrane_domain
XX
PN WO200150829-A2.
XX
PD 19-JUL-2001.

XX 

PF 09-MAY-2001; 2001WO-IB000799.
XX 

PR 09-MAY-2001; 2001WO-IB000799 . 
XX 

PA (BIEN/) BIENKOWSKI M J. 

PA (GURN/) GURNEY M E. 

PA (HEIN/) HEINRIKSON R L. 

PA (PARO/) PARODI L A. 

PA (YANR/) YAN R. 

XX 

PI Bienkowski MJ, Gurney ME, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2001-483072/52. 

DR N-PSDB; AAD13020. 
XX 

PT Novel purified polypeptide comprising fragment of mammalian aspartyl 

PT protease 2, lacking Asp2 transmembrane domain and retaining beta 

PT secretase activity of Asp2 useful for identifying inhibitors of Asp2 

PT activity. 

XX 

PS Example 2; Fig 1; 185pp; English. 
XX 

CC The invention relates to human aspartyl proteases (Hu-Asp), beta-amyloid

CC precursor protein (APP) isoforms and their corresponding DNA molecules.

CC Human aspartyl proteases can act as beta-secretase proteases useful for 

CC treating Alzheimer's disease. APP isoforms are useful for identifying 

CC modulators of amyloid-beta peptide production, for use in designing 

CC therapeutics for the treatment and prevention of Alzheimer's disease, 

CC dementia, formation of amyloid plaques, neurofibrillary tangles, gliosis 

CC and neuronal loss. APP isoforms are also used in methods for identifying 

CC inhibitors and modulators of human Asp2 activity. The invention relates 

CC to a method for identifying agents that modulate the activity of human 

CC aspartyl protease Asp2. Amyloid-beta peptides obtained from APP are used

CC as a means to screen in cellular assays for the inhibitors of beta- and 

CC gamma-secretase. Hu-Asp DNA fragments are useful as probes or primers in

CC polymerase chain reactions (PCR). The probes are useful for detecting Hu-

CC Asp nucleic acids in in vitro assays and in Northern and Southern blots.

CC The present sequence is human aspartyl protease 1 (Hu-Asp1). Hu-Asp 1

CC gene is localised on chromosome 21
XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MGALARAL LLP LLAQWLLRAAP E LAP AP FT L P LRVAAATN RWAP T P G P GT P AE RHAD GL 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I II I I I I I I I M 
Db 1 MGALARAL L L P L LAQ WL L RAAP E LAP AP FT L P L RVAAATN RWAP T P G P GT P AE RHAD G L 60 

Qy 61 ALALE P ALAS P AGAAN FLAMVDNLQGD S GRG YYLEML I GT P PQKLQ I LVDT G S S N FAVAG 120 

I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I M I I I I I I 
Db 61 ALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120 

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I I I 



Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180



Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

I I I I I I I I I II I I I I II I I I I I I I I II I I I I I M I I I I I I I I I I I I I I M I I M I II I I I 

Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

I I I I I I I I I I I I I I I I I I I I II I II I I I I I I M I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420 

I M | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 361 YLRDEN S S RS FRIT I LPQLYI QPMMGAGLN YEC YRFGI S P STNALVI GATVMEGFYVI FD 420 

Qy 421 RAQ KRVG FAAS P CAE I AGAAVS EISGPFST EDVAS NCVP AQ S L S E P I LW I VS YALMS VCG 480 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I I I 
Db 421 RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE D VASNCVP AQ S L S E P I LW I VS YALMS VCG 480 

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 

Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 11 


AAE02608 


ID AAE02608 standard; protein; 518 AA.
XX
AC AAE02608;
XX
DT 10-AUG-2001 (first entry)
XX
DE Human Aspartyl protease-1 (Asp-1) deltaTM (His)6 protein.
XX
KW Human; alpha-secretase; amyloid precursor protein; APP; therapy;
KW Alzheimer's disease; antialzheimer's; aspartyl protease 1; Asp1;
KW beta-secretase; Asp-1 deltaTM (His)6 protein.
XX
OS Homo sapiens.
OS Synthetic.
XX
PN WO200123533-A2.
XX
PD 05-APR-2001.
XX
PF 22-SEP-2000; 2000WO-US026080.
XX
PR 23-SEP-1999; 99US-0155493P.
PR 23-SEP-1999; 99WO-US020881.
PR 13-OCT-1999; 99US-00416901.
PR 06-DEC-1999; 99US-0169232P.
XX
PA (PHAA ) PHARMACIA & UPJOHN CO.
XX





PI Gurney M, Bienkowski MJ; 
XX 

DR WPI; 2001-290516/30. 
XX 

PT Enzymes that cleave the alpha-secretase site of the amyloid precursor 

PT protein, useful for the treatment of Alzheimer's disease.

XX 

PS Example 14; Page 183-184; 189pp; English. 
XX 

CC The present invention relates to enzymes for cleaving the alpha- 

CC secretase site of the amyloid precursor protein (APP) and methods of 

CC identifying those enzymes. The methods may be used to identify enzymes 

CC that may be used to cleave the alpha-secretase cleavage site of the APP 

CC protein. The enzymes may be used to treat or modulate the progress of 

CC Alzheimer's disease. The present sequence is human Aspartyl protease-1 

CC (Asp-1) deltaTM (His)6 protein which is used for the expression of pre-

CC pro-human-Aspartyl protease 1 (Asp1). This protein is obtained by

CC replacing C-terminal transmembrane and cytoplasmic domains with a 

CC hexahistidine purification tag in the human Aspartyl protease 1 

XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 

I I I I I I I M I I I I I I M I I I I I I I I I II I I I I I I I I I I 

Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 



RESULT 12 
AAE02580 



ID AAE02580 standard; protein; 518 AA.
XX
AC AAE02580;
XX
DT 10-AUG-2001 (first entry)
XX
DE Human aspartyl protease 1 (Asp 1).
XX
KW Human; alpha-secretase; amyloid precursor protein
KW Alzheimer's disease; antialzheimer's; aspartyl pr
KW beta-secretase; chromosome 21.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..20
FT /label= Signal_peptide
FT Peptide 22..62
FT /label= Asp 1 prepropeptide
FT Peptide 23..62
FT /label= Asp 1 propeptide
FT Protein 63..518
FT /label= Mature_human_Asp_1_protei
FT /note= "Specifically claimed"
FT Active-site 87..89
FT /label= Active_site_1
FT Active-site 110..113
FT /label= Active_site_2
FT Active-site 303..305
FT /label= Active_site_3
FT Domain 469..492
FT /label= Transmembrane_domain
FT Domain 493..518
FT /label= Cytoplasmic domain
FT Region 497..518
FT /note= "Peptide #1"
XX
PN WO200123533-A2.
XX
PD 05-APR-2001.
XX
PF 22-SEP-2000; 2000WO-US026080.
XX
PR 23-SEP-1999; 99US-0155493P.
PR 23-SEP-1999; 99WO-US020881.
PR 13-OCT-1999; 99US-00416901.
PR 06-DEC-1999; 99US-0169232P.
XX
PA (PHAA ) PHARMACIA & UPJOHN CO.



XX 

PI Gurney M, Bienkowski MJ; 
XX 

DR WPI; 2001-290516/30. 

DR N-PSDB; AAD06738. 
XX 

PT Enzymes that cleave the alpha-secretase site of the amyloid precursor 

PT protein, useful for the treatment of Alzheimer's disease. 

XX 

PS Claim 29; Fig 1; 189pp; English. 
XX 

CC The present invention relates to enzymes for cleaving the alpha- 

CC secretase site of the amyloid precursor protein (APP) and methods of 

CC identifying those enzymes. The methods may be used to identify enzymes 

CC that may be used to cleave the alpha-secretase cleavage site of the APP 

CC protein. The enzymes may be used to treat or modulate the progress of 

CC Alzheimer's disease. The present sequence is human aspartyl protease 1 

CC (Asp 1). Asp 1 has alpha-secretase protease and beta-secretase protease

CC activities. Asp 1 gene is located on chromosome 21 
XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518

       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518

RESULT 13 
AAU29059 

ID AAU29059 standard; protein; 518 AA. 
XX 

AC AAU29059; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Human PRO polypeptide sequence #36. 
XX 

KW PRO polypeptide; mammal; tumour; cancer; human; cattle; horse; sheep; 

KW dog; cat; pig; goat; rabbit; tumour necrosis factor alpha; TNF-alpha; 

KW blood; chondrocyte cell; cell proliferation; cell differentiation; colon; 

KW adrenal; lung; breast; prostate; rectum; cervix; liver; genetic disorder. 



XX
OS Homo sapiens.
XX
PN WO200168848-A2.
XX
PD 20-SEP-2001.
XX
PF 28-FEB-2001; 2001WO-US006520.
XX
PR 01-MAR-2000; 2000WO-US005601.
PR 02-MAR-2000;
PR 03-MAR-2000; 2000US-0187202P.
PR 06-MAR-2000; 2000US-0186968P.
PR 14-MAR-2000; 2000US-0189320P.
PR 14-MAR-2000; 2000US-0189328P.
PR 15-MAR-2000; 2000WO-US006884.
PR 21-MAR-2000; 2000US-0190828P.
PR 21-MAR-2000; 2000US-0191007P.
PR 21-MAR-2000; 2000US-0191048P.
PR 21-MAR-2000; 2000US-0191314P.
PR 28-MAR-2000; 2000US-0192655P.
PR 29-MAR-2000; 2000US-0193032P.
PR 29-MAR-2000; 2000US-0193053P.
PR 30-MAR-2000; 2000WO-US008439.
PR 04-APR-2000; 2000US-0194449P.
PR 04-APR-2000; 2000US-0194647P.
PR 11-APR-2000; 2000US-0195975P.
PR 11-APR-2000; 2000US-0196000P.
PR 11-APR-2000; 2000US-0196187P.
PR 11-APR-2000; 2000US-0196690P.
PR 11-APR-2000; 2000US-0196820P.
PR 18-APR-2000; 2000US-0198121P.
PR 18-APR-2000; 2000US-0198585P.
PR 25-APR-2000; 2000US-0199397P.
PR 25-APR-2000; 2000US-0199550P.
PR 25-APR-2000; 2000US-0199654P.
PR 03-MAY-2000; 2000US-0201516P.
PR 17-MAY-2000; 2000WO-US013705.
PR 22-MAY-2000; 2000WO-US014042.
PR 30-MAY-2000; 2000WO-US014941.
PR 02-JUN-2000; 2000WO-US015264.
PR 05-JUN-2000; 2000US-0209832P.
PR 28-JUL-2000; 2000WO-US020710.
PR 22-AUG-2000; 2000US-00644848.
PR 24-AUG-2000; 2000WO-US023328.
PR 08-NOV-2000; 2000WO-US030952.
PR 01-DEC-2000; 2000WO-US032678.
PR 20-DEC-2000; 2000WO-US034956.
XX
PA (GETH ) GENENTECH INC.
XX
PI Baker KP, Chen J, Desnoyers L, Goddard A, Godowski PJ, Gurney AL;
PI Pan J, Smith V, Watanabe CK, Wood WI, Zhang Z;
XX
DR WPI; 2001-602746/68.
DR N-PSDB; AAS45960.
XX
PT Novel nucleic acids encoding PRO polypeptides, used to diagnose the
PT presence of tumors, such as prostate and breast tumors, in mammals and to
PT screen for modulators of the compounds.
XX
PS Claim 11; Fig 72; 774pp; English.
XX
CC Sequences AAU29024-AAU29328 represent PRO polypeptides of the invention.
CC The PRO polypeptides and their associated nucleic acids can be used to
CC detect the presence of a tumour in a mammal by comparing the level of
CC expression of a PRO polypeptide in a test sample of cells from the animal
CC and a control sample of normal cells, whereby a higher level of
CC expression in the test sample indicates the presence of a tumour in the
CC mammal. Mammals include dogs, cats, cattle, horses, sheep, pigs, goats
CC and rabbits but are preferably human. The polypeptides can be used to
CC stimulate tumour necrosis factor (TNF) alpha release from human blood,
CC when contacted with it. A specific polypeptide can be used to stimulate
CC the proliferation or differentiation of chondrocyte cells. The PRO
CC proteins can be used to determine the presence of tumours and also
CC susceptibility to tumour development, particularly adrenal, lung, colon,
CC breast, prostate, rectal, cervical, or liver tumours, in mammalian
CC subjects. The oligonucleotide probes specific for the PRO nucleic acids
CC can be used for genetic analysis of individuals with genetic disorders
XX
SQ Sequence 518 AA;



Query Match 100.0%; Score 2687; DB 4; Length 518;
Best Local Similarity 100.0%; Pred. No. 8.6e-231;
Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 14
AAU06602
ID AAU06602 standard; protein; 518 AA.
XX
AC AAU06602;
XX
DT 24-OCT-2001 (first entry)
XX
DE Human Aspartyl protease 1 (Asp1).
XX
KW Human; Aspartyl protease; Asp1; Asp2; beta-secretase; nootropic;
KW neuroprotective; amyloid protein precursor; APP; Alzheimer's disease;
KW amyloid-beta; Abeta.
XX
OS Homo sapiens.
XX
PN WO200149098-A2.
XX
PD 12-JUL-2001.
XX
PF 09-MAY-2001; 2001WO-IB000798.
XX
PR 09-MAY-2001; 2001WO-IB000798.
XX
PA (BIEN/) BIENKOWSKI M J.
PA (GURN/) GURNEY M E.
PA (HEIN/) HEINRIKSON R L.
PA (PARO/) PARODI L A.



PA (YANR/) YAN R. 
XX 

PI Bienkowski MJ, Gurney ME, Heinrikson RL, Parodi LA, Yan R; 
XX 

DR WPI; 2001-502549/55. 

DR N-PSDB; AAS11516. 
XX 

PT Novel purified polypeptide comprising fragment of mammalian aspartyl 

PT protease 2, lacking Asp2 transmembrane domain and retaining beta 

PT secretase activity of Asp2 useful for identifying inhibitors of Asp2 

PT activity. 

XX 

PS Example 2; Fig 1; 185pp; English. 
XX 

CC The invention relates to a purified polypeptide comprising a fragment of 

CC mammalian aspartyl protease (Asp) 2 protein which lacks the Asp2 

CC transmembrane domain and the Asp2 protein, and where the polypeptide and 

CC the fragment retain the beta-secretase activity of the mammalian Asp2 

CC protein. The invention also details polynucleotides for the Asp proteins 

CC and vectors expressing them, and a polypeptide (isoform of amyloid 

CC protein precursor (APP) ) comprising the amino acid sequence of an APP or 

CC its fragment containing an APP cleavage site recognizable by a mammalian 

CC beta-secretase, and further comprising two lysine residues at the 

CC carboxyl terminus of the amino acid sequence of the mammalian APP or APP 

CC fragment. Also included in the invention are methods of identifying 

CC modulators or inhibitors of Asp2. Modulators and inhibitors of Asp2 are

CC useful for treating Alzheimer's disease. APP is useful in methods for 

CC identifying inhibitors or modulators of human Asp2 activity and amyloid- 

CC beta (Abeta) peptide production. APP is also useful in designing 

CC therapeutics for the treatment or prevention of Alzheimer's disease. APP 

CC comprising the APP-Sw-beta-secretase peptide sequence (NLDA) , which is 

CC associated with increased levels of Abeta processing is useful in assays 

CC relating the Alzheimer's research. The expression vector is useful for 

CC recombinantly expressing APP. Nucleic acids that hybridise to Asp

CC oligonucleotides are useful as probes or primers. The probes are useful 

CC for detecting Hu-Asp nucleic acids in in vitro assays and in Northern and 

CC Southern blots. The present sequence is human Asp1

XX 

SQ Sequence 518 AA; 

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 8.6e-231; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 15
ABB06531
ID ABB06531 standard; protein; 518 AA.
XX
AC ABB06531;
XX
DT 31-MAY-2002 (first entry)
XX
DE Human aspartyl protease 1 protein SEQ ID NO: 125.
XX
KW Beta-secretase; enzyme; cleavage site; amyloid protein precursor; APP;
KW aspartyl protease; neuroprotective; nootropic; beta-secretase inhibitor;
KW Alzheimer's disease.
XX
OS Homo sapiens.
XX
PN WO200206306-A2.
XX
PD 24-JAN-2002.
XX
PF 19-JUL-2001; 2001WO-US023035.
XX
PR 19-JUL-2000; 2000US-0219795P.
PR 12-MAR-2001; 2001US-0275251P.
XX
PA (PHAA ) PHARMACIA & UPJOHN CO.
XX
PI Yan R, Tomasselli AG, Gurney ME, Emmons TL, Bienkowski MJ;
PI Heinrikson RL;
XX
DR WPI; 2002-216995/27.
XX
PT Novel substrates for human aspartyl protease useful for identifying
PT modulators of beta secretase activity of aspartyl protease for treating
PT Alzheimer's disease.
XX
PS Disclosure; Page 161-162; 188pp; English.
XX
CC The present invention describes an isolated peptide (I) comprising a
CC sequence of at least four amino acids, where the peptide is a substrate
CC for conducting aspartyl protease assays. (I) has neuroprotective and
CC nootropic activities, and can be used as an inhibitor of beta-secretase
CC activity. A beta-secretase modulator from the present invention can be
CC used for inhibiting beta-secretase activity in vivo, and in the
CC manufacture of a medicament for the treatment of Alzheimer's disease.
CC Pharmaceutical compositions from the present invention can be used for
CC treating a disease or condition characterised by an abnormal beta-
CC secretase activity. (I) is useful for identifying agents that modulate
CC the activity of human Asp2 aspartyl protease (Hu-Asp2). (I) is useful as
CC a core structure to construct derivatives. ABL49914 to ABL49925 and
CC ABB06409 to ABB06593 represent sequences used in the exemplification of
CC the present invention
XX
SQ Sequence 518 AA;



Query Match 100.0%; Score 2687; DB 5; Length 518;
Best Local Similarity 100.0%; Pred. No. 8.6e-231;
Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518

Search completed: March 4, 2004, 15:35:42 
Job time : 107.702 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.



OM protein - protein search, using sw model 



Run on: March 4, 2004, 15:31:20 ; Search time 33.6149 Seconds 

(without alignments) 
795.548 Million cell updates/sec 



Title: US-09-668-314C-2 
Perfect score: 2687 

Sequence: 1 MGALARALLLPLLAQWLLRA RPRDPEVVNDESSLVRHRWK 518

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 
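
The throughput figure in this run header can be cross-checked from the other values it reports. The sketch below assumes that one "cell update" is one dynamic-programming cell, i.e. one (query residue, database residue) pair, so the total is the query length times the number of residues searched; that definition is an assumption, not something the report states. The function name is illustrative only.

```python
def cell_updates_per_second(query_length: int, db_residues: int, seconds: float) -> float:
    """Return dynamic-programming cell updates per second.

    Assumes one cell per (query residue, database residue) pair,
    which is an assumption about how the report counts cells.
    """
    return query_length * db_residues / seconds


# Values taken from the run header: a 518-residue query,
# 51625971 database residues, 33.6149 seconds of search time.
rate = cell_updates_per_second(518, 51_625_971, 33.6149)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the reported 795.548 Million cell updates/sec to three decimal places.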



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing : 



Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 



Database : 



Issued_Patents_AA:*
1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*
2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*
3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*
4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*
5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*
6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1
US-08-999-723-2
; Sequence 2, Application US/08999723A
; Patent No. 6025180
; GENERAL INFORMATION:
; APPLICANT: Powell, David J.
; APPLICANT: Southan, Christopher
; APPLICANT: Chapman, Conrad G.
; APPLICANT: Evans, Joanne R.
; TITLE OF INVENTION: ASP1
; FILE REFERENCE: GH70262
; CURRENT APPLICATION NUMBER: US/08/999,723A
; CURRENT FILING DATE: 1997-10-06
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 2
; LENGTH: 518
; TYPE: PRT
; ORGANISM: Homo sapiens
US-08-999-723-2

Query Match 100.0%; Score 2687; DB 3; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.2e-243; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 2
US-09-434-427-2
; Sequence 2, Application US/09434427
; Patent No. 6162630
; GENERAL INFORMATION:
; APPLICANT: POWELL, DAVID J.
; APPLICANT: SOUTHAN, CHRISTOPHER
; APPLICANT: CHAPMAN, CONRAD G.
; APPLICANT: EVANS, JOANNE R.
; TITLE OF INVENTION: ASP1
; FILE REFERENCE: GH-70262-D1
; CURRENT APPLICATION NUMBER: US/09/434,427
; CURRENT FILING DATE: 1999-11-04
; EARLIER APPLICATION NUMBER: US 08/999,723
; EARLIER FILING DATE: 1997-10-06
; EARLIER APPLICATION NUMBER: UK 9626022.9
; EARLIER FILING DATE: 1996-12-14
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 518
; TYPE: PRT
; ORGANISM: HOMO SAPIENS
US-09-434-427-2

Query Match 100.0%; Score 2687; DB 3; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.2e-243; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 3
US-09-548-372D-2
; Sequence 2, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 2
; LENGTH: 518
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-372D-2

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.2e-243; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 4
US-09-548-367D-2
; Sequence 2, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 2
; LENGTH: 518
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-367D-2

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.2e-243; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 5
US-09-551-853D-2
; Sequence 2, Application US/09551853D
; Patent No. 6500667
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280L
; CURRENT APPLICATION NUMBER: US/09/551,853D
; CURRENT FILING DATE: 2000-04-18
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 2
; LENGTH: 518
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-551-853D-2

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.2e-243; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 6
US-09-215-450-19
; Sequence 19, Application US/09215450
; Patent No. 6635748
; GENERAL INFORMATION:
; APPLICANT: Giese, Klaus
; APPLICANT: Xin, Hong
; TITLE OF INVENTION: METASTATIC BREAST AND COLON CANCER REGULATED GENES
; FILE REFERENCE: 1451.100 / 210030.447
; CURRENT APPLICATION NUMBER: US/09/215,450
; CURRENT FILING DATE: 1998-12-17
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 19
; LENGTH: 518
; TYPE: PRT
; ORGANISM: human
US-09-215-450-19

Query Match 100.0%; Score 2687; DB 4; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.2e-243; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 7 
US-09-717-432-2 

; Sequence 2, Application US/09717432
; Patent No. 6291223
; GENERAL INFORMATION:
; APPLICANT: ZHU, YUAN
; APPLICANT: LI, XIAOTONG
; APPLICANT: CHRISTIE, GARY
; APPLICANT: POWELL, DAVID J.
; TITLE OF INVENTION: Mouse Aspartic Secretase-1 (mASP1)
; FILE REFERENCE: GP-70663
; CURRENT APPLICATION NUMBER: US/09/717,432
; CURRENT FILING DATE: 2000-11-21
; PRIOR APPLICATION NUMBER: 60/166,974
; PRIOR FILING DATE: 1999-11-23
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 514
; TYPE: PRT
; ORGANISM: MUS MUSCULUS
US-09-717-432-2 



Query Match 89.1%; Score 2395; DB 3; Length 514; 

Best Local Similarity 88.6%; Pred. No. 5.6e-216; 

Matches 459; Conservative 20; Mismatches 35; Indels 4; Gaps 1; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       |||| ||||| :|||||| | | ||||||||||:|| |||   :  || |||    ||||
Db   1 MGALLRALLLLVLAQWLLSAVPALAPAPFTLPLQVAGATNHRASAVPGLGTPELPRADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||  |:    ||||||||||||||||||||||||||||||:|||||||||||||||
Db  61 ALALEPVRAT----ANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVAG 116

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||:| |||| |||||||||||||||||||||||||||||||:|||||||||
Db 117 APHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATI 176

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       |||||||||||||||||||||| |||||||||||||||| || ||::|||||||||||||
Db 177 FESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVA 236

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||
Db 237 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKA 296

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||| |||||||||||||:||||||||||||:|||||||
Db 297 IVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISI 356

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||:||||||||||||||||||||| |||||||||| |||||||||||||||||:||
Db 357 YLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYVVFD 416

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       |||:||||| |||||| |  ||||||||||||:||||||||:|:||||||||||||||||
Db 417 RAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCG 476

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       |||||||:|||||  |:  |||||||||||||||||||
Db 477 AILLVLILLLLLPLHCRHAPRDPEVVNDESSLVRHRWK 514



RESULT 8 
US-09-912-484-2 

; Sequence 2, Application US/09912484
; Patent No. 6358725
; GENERAL INFORMATION:
; APPLICANT: Christie, Gary
; APPLICANT: Li, Xiaotong
; APPLICANT: Powell, David J.
; APPLICANT: Zhu, Yuan
; TITLE OF INVENTION: Mouse Aspartic Secretase-1 (mASP1)
; FILE REFERENCE: GP-70663-D1
; CURRENT APPLICATION NUMBER: US/09/912,484
; CURRENT FILING DATE: 2001-07-25
; PRIOR APPLICATION NUMBER: 60/166,974
; PRIOR FILING DATE: 1999-11-23
; PRIOR APPLICATION NUMBER: 09/717,432
; PRIOR FILING DATE: 2000-11-21
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 514
; TYPE: PRT
; ORGANISM: MUS MUSCULUS
US-09-912-484-2 



Query Match 89.1%; Score 2395; DB 4; Length 514; 

Best Local Similarity 88.6%; Pred. No. 5.6e-216; 

Matches 459; Conservative 20; Mismatches 35; Indels 4; Gaps 1; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       |||| ||||| :|||||| | | ||||||||||:|| |||   :  || |||    ||||
Db   1 MGALLRALLLLVLAQWLLSAVPALAPAPFTLPLQVAGATNHRASAVPGLGTPELPRADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||  |:    ||||||||||||||||||||||||||||||:|||||||||||||||
Db  61 ALALEPVRAT----ANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVAG 116

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
        ||||||||||:| |||| |||||||||||||||||||||||||||||||:|||||||||
Db 117 APHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATI 176

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       |||||||||||||||||||||| |||||||||||||||| || ||::|||||||||||||
Db 177 FESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVA 236

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||
Db 237 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKA 296

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||| |||||||||||||:||||||||||||:|||||||
Db 297 IVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISI 356

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||:||||||||||||||||||||| |||||||||| |||||||||||||||||:||
Db 357 YLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYVVFD 416

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       |||:||||| |||||| |  ||||||||||||:||||||||:|:||||||||||||||||
Db 417 RAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCG 476

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       |||||||:|||||  |:  |||||||||||||||||||
Db 477 AILLVLILLLLLPLHCRHAPRDPEVVNDESSLVRHRWK 514



RESULT 9 
US-09-713-158-2 

; Sequence 2, Application US/09713158
; Patent No. 6361975
; GENERAL INFORMATION:
; APPLICANT: ZHU, YUAN
; APPLICANT: LI, XIAOTONG
; APPLICANT: POWELL, DAVID J.
; APPLICANT: CHRISTIE, GARY
; TITLE OF INVENTION: MOUSE ASPARTIC SECRETASE-2 (MASP-2)
; FILE REFERENCE: GP-70660
; CURRENT APPLICATION NUMBER: US/09/713,158
; CURRENT FILING DATE: 2000-11-15
; PRIOR APPLICATION NUMBER: 60/165,800
; PRIOR FILING DATE: 1999-11-16
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 501
; TYPE: PRT
; ORGANISM: MUS MUSCULUS
US-09-713-158-2



Query Match 44.2%; Score 1186.5; DB 4; Length 501; 

Best Local Similarity 46.0%; Pred. No. 1.4e-102; 

Matches 238; Conservative 83; Mismatches 167; Indels 29; Gaps 8; 



Qy 7 ALLLPLLAQWLLRAAPELAPAPFT L P L RVAAATN RWAPT P G P GT P AE RHAD GLA 61

I I I I I : : I I I I I I I I II II :

Db 2 AQALPWLLLWV GSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDEES — 51

Qy 62 LALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGT 121

I : I : 11111:1 I I : I I I : I I : I : I I I I I I I I I I I I I I I I

Db 52 E E P GRRG S FVEMVDN LRG KS GQG Y YVEMT VG S P P QT LN I L VDT G S S N FAVGAA 104

Qy 122 PHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIF 181

I I I I I I I I I I I I I I I I I : I I I I I I

Db 105 PHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAIT 164

Qy 182 ESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV — 239

II: II: I I I I I I I I I I : I : I III I I I I I I I : I I I : II : I : I I I I I :

Db 165 ESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQ 224

Qy 240 -AGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNAD 298

I I I : : : I III I : I I I I I : MM:: | : : : || II I : II : I II I

Db 225 TEALASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYD 284

Qy 299 KAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKI 358

I : II I II II II II : I II : I I : : : II M II I I I II II I II : II I

Db 285 KSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVI 344

Qy 359 SIYLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYV 417

MM I I I I I I I I I I MM MM : I I : MM Mill

Db 345 SLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYV 404

Qy 418 IFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMS 477

M II I M I M II I I I I I

Db 405 VFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAA 464

Qy 478 VCGAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 512

Ml I I : Mil

Db 465 IC-ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500



RESULT 10 
US-09-548-372D-8 

; Sequence 8, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 8
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-548-372D-8 

Query Match 44.1%; Score 1185; DB 4; Length 501; 

Best Local Similarity 46.0%; Pred. No. 1.9e-102; 

Matches 237; Conservative 83; Mismatches 169; Indels 26; Gaps 7; 

Qy 9 LLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRWAPTPGPGTPAERHADGLALA 63 

: I I II : I I I I I I I I II I I : 
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDEES 51 

Qy 64 LE PALAS PAGAAN FLAMVDNLQGD S GRG YYLEML I GT P PQKLQ I LVDT GS SN FAVAGT PH 123 

I : I : I I II I : I I I : I I I : I I : I : I I I I I I I I I I I I I I I I II 

Db 52 E E P GRRG S FVEMVDN L RGK S GQG Y YVEMT VG S P P QT LN I LVDT G S S N FAVGAAP H 106 

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183

: : I : : I I I I I I I I I I I I I : I II I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166 

Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV A 240 

: II: I I I I I I I I I I : I : I III I II I I I I : I I I : II : I : I I I I I : 

Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

: I I I : : : I I I : III I : I I I I I : MM:: I : : : I I II I : I I : I I I I I :

Db 227 ALASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKS 286

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

II I I I I I I I I I : I I I : I I :: : II : I I I I I I I I I I I I I : I I I I :

Db 287 IVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISL 346



Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIF 419 

II I : : : I I I I I I I I I I : : I : : : I I : I : I I : I : I I : I I I I I I : I 

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYWF 406 

Qy 420 D RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE D VAS N C VP AQ S L S E P I LW I VS YALMS VC 479 

I I I : I I : I I I I I : : | | | | | : I | : : I : : : I 

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466

Qy 480 GAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 512

I : : : : I : : : I I I I : : : I I I

Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500



RESULT 11 
US-09-548-367D-8 

; Sequence 8, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 8
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-548-367D-8 

Query Match 44.1%; Score 1185; DB 4; Length 501; 

Best Local Similarity 46.0%; Pred. No. 1.9e-102; 

Matches 237; Conservative 83; Mismatches 169; Indels 26; Gaps 7; 

Qy 9 LLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRWAPTPGPGTPAERHADGLALA 63 

: I I II : I I I I I I I I II M : 
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDEES 51 

Qy 64 LEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPH 123 

I : I : I I I I I : I I I : I I I : II : I : I I I I I I I I I I I I II I I II 

Db 52 EEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPH 106 

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183 

: : I : : I I I I I I I I I II I I : I I I I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166 



Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV A 240 

: II: I I I I I I I I I I : I : I III I I I I I I I : I I I : I I : I : I II I I : 
Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

: I I I : : : I I I : III I : I I I I I : MM:: I : : : I I II I : | | : I I I I I = 

Db 227 ALASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKS 286 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

I I II I I I I I I I I : I I I : I I : : : II : I I I I I I I I I I I I I : I I I I : 

Db 287 IVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISL 346

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIF 419

II I : : : I I I I I I I I I I : : I : : : I I : I : I I : I : I I : I I I I I I : I 

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYWF 406 

Qy 420 D RAQ K RVG FAAS P CAE I AGAAVS EISGPFSTE D VAS N C VP AQ S L S E P I LW I VS YALMS VC 479 

I I I : I I : I I I I I : : | | | | | : | I : : | : : : I 

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466 

Qy 480 GAILLVLIVLLLLPFRCQR — RPRDPEWNDESSL 512 

I : : : : I : : : I I I I : : : I I I 

Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500 



RESULT 12 
US-09-551-853D-8 

; Sequence 8, Application US/09551853D
; Patent No. 6500667
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280L
; CURRENT APPLICATION NUMBER: US/09/551,853D
; CURRENT FILING DATE: 2000-04-18
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 8
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-551-853D-8 

Query Match 44.1%; Score 1185; DB 4; Length 501; 

Best Local Similarity 46.0%; Pred. No. 1.9e-102; 

Matches 237; Conservative 83; Mismatches 169; Indels 26; Gaps 7; 



Qy 9 LLPLLAQWLLRAAPELAPAPFT L P L RVAAATN RWAPT P G P GT P AE RHAD GLALA 63 

: I I II : I I I I I I I I II II : 

Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDEES 51 

Qy 64 LEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPH 123

I : I : I I I I I : I I I : I I I : I I : I : I I I I I I I I I I I II I I I II 

Db 52 E E P G RRG S FVEMVDN L RG K S GQ G Y YVEMT VG S P PQT LN I L VDT G S S N FAVGAAP H 106 

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183 

: : I : : I I I I I I I I I I I I I : I I I I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166

Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV A 240 

: I I : I I I I I I I I II : I : I I I I I I I I I I I : I I I : M : I : I I I I I : 
Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

: I I I : : : I I I : III I : I I I I I : MM:: I : : : I I I I I : I I : I I I I I : 
Db 227 ALASVGGSMI I GGI DHS LYTGS LWYT P I RREWYYEVI I VRVEINGQDLKMDCKE YN YDKS 286 

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

I I I I I I I I I II : I I I : I I :: : II : I I I I I I II I I I I I : I I I I : ' 

Db 287 I VDSGTTNLRLPKKVFEAAVKS I KAAS STEKFPDGFWLGEQLVCWQAGTTPWNI FPVI S L 346 

Qy 361 YLRDENSSRS FRITI LPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVI F 419 

II I : : : I I I I I I I I I I : : I : : : I I : I : I I : I : M : I I I I I I : I 

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYWF 406 

Qy 420 D RAQ KRVG FAAS P CAE I AGAAVS EISGPFSTE DVASNCVP AQ S L S E P I LW I VS YALMS VC 479 

I I I : I I : I I I I I : : | | | | | : I I : : I : : : I 

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466

Qy 480 GAILLVLIVLLLLPFRCQR — RPRDPEWNDESSL 512 

I : : : : I : : : I I I I : : : I I I 
Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500 



RESULT 13 
US-09-724-566A-65 

; Sequence 65, Application US/09724566A
; Patent No. 6627739
; GENERAL INFORMATION:
; APPLICANT: Anderson, John P.
; APPLICANT: Basi, Guriqbal
; APPLICANT: Doane, Minh Tam
; APPLICANT: Frigon, Normand
; APPLICANT: John, Varghese
; APPLICANT: Power, Michael
; APPLICANT: Sinha, Sukanto
; APPLICANT: Tatsuno, Gwen
; APPLICANT: Tung, Jay
; APPLICANT: Wang, Shuwen
; APPLICANT: McConlogue, Lisa
; TITLE OF INVENTION: Beta-Secretase Enzyme Compositions and
; TITLE OF INVENTION: Methods
; FILE REFERENCE: 228-US-NEWC2
; CURRENT APPLICATION NUMBER: US/09/724,566A
; CURRENT FILING DATE: 2000-11-28
; PRIOR APPLICATION NUMBER: US 09/501,708
; PRIOR FILING DATE: 2000-02-10
; PRIOR APPLICATION NUMBER: 60/119,571
; PRIOR FILING DATE: 1999-02-10
; PRIOR APPLICATION NUMBER: 60/139,172
; PRIOR FILING DATE: 1999-06-15
; NUMBER OF SEQ ID NOS: 104
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 65
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Mus musculus
US-09-724-566A-65 

Query Match 44.1%; Score 1184.5; DB 4; Length 501; 

Best Local Similarity 45.9%; Pred. No. 2.1e-102; 

Matches 237; Conservative 84; Mismatches 170; Indels 25; Gaps 7; 

Qy 9 LLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRWAPTPGPGT PAERHADGLALA 63 

: I I II : I I I I I I I I II II : 
Db 1 MAP ALHWL L LWVG S GML P AQ GT H L G I RL P L RS GLA GPPLGLRLPRETDEES 51 

Qy 64 LEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPH 123

I : I : I I I I I : I I I : I I I : M : I : I I I I I I I I I I I I I I I I II 

Db 52 EEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGS PPQTLNI LVDTGS SNFAVGAAPH 106 

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183 

: : I : : I I I I I I I I I I I I I : I M I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166 

Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV A 240 

- : II: I I II I I I I I I : I : I III MINI I : I II : I I : I : I I I I I : 
Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

: I I I : : : I I I : III I : I I I I I : MM:: I : : : I I I I I : M : I I I II: 

Db 227 ALAS VGGSMI I GGI DHS LYTGS LWYTP I RREWYYEVI I VRVEINGQDLKMDCKE YN YDKS 286 

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

I I I I I I I I I I I : I I I : I I : : : II : I I I I I I I I I I I I I : I I I I : 

Db 287 IVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISL 346 

Qy 361 YLRDENS SRS FRITI LPQLYIQPMMGAGLNY- ECYRFGI S PSTNALVI GATVMEGFYVI F 419 

II I : : : I I I I I I I I I I : : I : : : I I : I : I I : I : I I : I I I I I I : I 

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYWF 406 

Qy 420 D RAQ K RVG FAAS P CAE I AGAAVS EISGPFSTE D VAS N C VP AQ S L S E P I LW I VS YALMS VC 479 

I II : II : I I I I I : : I I I I I : I I : : I : : : I 

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466 

Qy 480 GAILLVLIVLLLLPFRCQRRPR-DPEWNDESSLVR 514 

I : : : : I : : : I I I I : I : I I : : 

Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFGDDISLLK 501



RESULT 14 



US-09-548-372D-4 

; Sequence 4, Application US/09548372D
; Patent No. 6420534
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/62801
; CURRENT APPLICATION NUMBER: US/09/548,372D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 4
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-372D-4 

Query Match 43.9%; Score 1178.5; DB 4; Length 501; 

Best Local Similarity 46.2%; Pred. No. 7.8e-102; 

Matches 240; Conservative 82; Mismatches 164; Indels 33; Gaps 9; 

Qy 7 ALLLPLLAQWLLRAAPELAPAPFT L P LRVAAATN RWAP T P G P GT P AE RHAD GLA 61 

I I I I I : : I I I I I I I II II 
Db 2 AQAL PWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

Qy 62 LALE — PALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVA 119 

I I I : I : I I I I I : I I I : I II : I I : I : I I I I I I I I I I I I I I I I 

Db 43 LPRETDEEPEEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGS PPQTLNI LVDTGSSNFAVG 102 

Qy 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179 

II:: I : : I I I M I I I I I I I I : I I I I : I I I I : III 

Db 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162 

Qy 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

I II: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I : I : I I II I : 
Db 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPL 222 

Qy 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296 

I : I I I : : : I I I : III I : I I I I I : MM:: | : : : | | II ! : I I : I I I 

Db 223 NQS EVLASVGGSMI I GGI DHSLYTGSLWYTPI RREWYYEVI IVRVEINGQDLKMDCKEYN 282 

Qy 297 ADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356

I I : I I I I I I I I I It : I I I : I I :: : II : I I I I I I II I I I I I : I I 

Db 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

Qy 357 KI S I YLRDENS S RS FRI T I LPQL YIQPMMGAGLN Y- EC YRFGI S PSTNALVI GATVMEGF 415 

I I : I I I : : : I I I I I I I I I I : : I : : : I I : I I I I : I : I I : I I I I 



Db 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402



Qy 416 YVI FDRAQKRVG FAAS P CAE I AGAAVS E I S GP FS T EDVASNCVP AQ S L S EP I LWI VS YAL 475 

I I : I I I I : I I : I II I I : : I I I I I : I I : : I : 

Db 403 YVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462

Qy 476 MSVCGAILLVLIVLLLLPFRCQR — RPRDPEWNDESSL 512 

: : I I : : : : I : : : I I I I : : : I I I 

Db 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 



RESULT 15 
US-09-548-367D-4 

; Sequence 4, Application US/09548367D
; Patent No. 6440698
; GENERAL INFORMATION:
; APPLICANT: GURNEY ET AL.
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR AND USES
; TITLE OF INVENTION: THEREOF
; FILE REFERENCE: 29915/6280H
; CURRENT APPLICATION NUMBER: US/09/548,367D
; CURRENT FILING DATE: 2000-04-12
; PRIOR APPLICATION NUMBER: US 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: US 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 4
; LENGTH: 501
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-548-367D-4 

Query Match 43.9%; Score 1178.5; DB 4; Length 501; 

Best Local Similarity 46.2%; Pred. No. 7.8e-102; 

Matches 240; Conservative 82; Mismatches 164; Indels 33; Gaps 9; 

Qy 7 ALLLPLLAQWLLRAAPELAPAPFT L P L RVAAATN RWAP T P G P GT PAE RHADGLA 61 

I III I : : I I I I I I I II II 

Db 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

Qy 62 LALE — PALAS PAGAANFLAMVDNLQGDSGRGYYLEMLI GTPPQKLQI LVDTGS SNFAVA 119 

I I I : I : I I I I I : I I I : I I I : I I : I : I I I I I I I I I I I I I I I I 

Db 43 LPRETDEEPEEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGS PPQTLN I LVDTGS SNFAVG 102 

Qy 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179

II:: I : : I I I I I I I I I I I I I : I I I I : I I I I : III 

Db 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162

Qy 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

I II: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I : I : I I I I I : 



Db 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPL 222 

Qy 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296

I : I I I : : : I I I : III I : I II I I : I I I I : : I : : : I I II I : I I : I I I 

Db 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

Qy 297 ADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356 

I I : I I I I I I I I I I I : I I I : I I :: : II : I I I I I I I I I I I I I : I I 

Db 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

Qy 357 KI S I YLRDENS S RS FRI T I LPQLYI QPMMGAGLN Y-EC YRFGI S P STNALVI GATVMEGF 415 

I I : I I I : : : I I I I I I I I I I : : I : : : I I : I I I I : I : I I : I I I I 

Db 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402 

Qy 416 YVI FD RAQ KRVGFAAS P C AEI AGAAVS EISGPFSTE DVAS N CVP AQ S L S E P I LW I VS YAL 475 

I I : I I I I : II : I I I I I : : | | | | | : I I : : I : 

Db 403 YVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462

Qy 476 MSVCGAILLVLIVLLLLPFRCQR— RPRDPEWNDESSL 512 

: : I I : : : : I : : : I I I I : : : I I I 

Db 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 



Search completed: March 4, 2004, 15:42:13 
Job time : 34.6149 secs
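
The per-result statistics in the listing above can be cross-checked with simple arithmetic. The sketch below assumes, based only on the figures printed in this report and not on GenCore documentation, that Query Match is the score expressed as a percentage of the perfect score (2687), and that Best Local Similarity is the number of identical positions as a percentage of all aligned columns (matches + conservative + mismatches + indels). The function names are illustrative.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-alignment) score.

    Assumption: this is how the report's "Query Match" figure is derived.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns.

    Assumption: the alignment length is the sum of the four reported counts.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # RESULT 7 (US-09-717-432-2): report shows 89.1% and 88.6%
    print(query_match_pct(2395, 2687))
    print(best_local_similarity_pct(459, 20, 35, 4))
    # RESULT 14 (US-09-548-372D-4): report shows 43.9% and 46.2%
    print(query_match_pct(1178.5, 2687))
    print(best_local_similarity_pct(240, 82, 164, 33))
```

Under these assumptions the computed values agree with the percentages printed for RESULT 7 and RESULT 14, which is also a quick way to spot a mis-read digit in a score or count.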



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd 



OM protein - protein search, using sw model 
Run on: March 4, 2004, 15:39:01 ; Search time 57.8617 Seconds 

(without alignments) 
1890.324 Million cell updates/sec 



Title: US-09-668-314C-2
Perfect score: 2687
Sequence: 1 MGALARALLLPLLAQWLLRA..........RPRDPEVVNDESSLVRHRWK 518



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 



Searched: 809742 seqs, 211153259 residues

Total number of hits satisfying chosen parameters: 809742

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%

Maximum Match 100%
Listing first 45 summaries



Database : Published_Applications_AA:*
 1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
 2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
 3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
 4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
 5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
 6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
 7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
 8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
 9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 



                      %
Result              Query
   No.     Score    Match  Length  DB  ID        Description



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 4, 2004, 15:30:05 ; Search time 28.1043 Seconds 

(without alignments) 
1772.942 Million cell updates/sec 

Title: US-09-668-314C-2 
Perfect score: 2687 

Sequence: 1 MGALARALLLPLLAQWLLRA..........RPRDPEVVNDESSLVRHRWK 518



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database :       PIR_78:*
              1: pir1:*
              2: pir2:*
              3: pir3:*
              4: pir4:*
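
The header names a Smith-Waterman ("sw") model with gap-open 10.0 and gap-extend 0.5. A minimal sketch of local alignment scoring with affine gaps follows. It is not GenCore's implementation: the toy match/mismatch scorer stands in for BLOSUM62 (the full matrix is omitted for brevity), and the convention that the first gap position costs gap_open and each further position costs gap_extend is an assumption.

```python
from typing import Callable


def toy_score(a: str, b: str) -> float:
    # Placeholder for a substitution matrix such as BLOSUM62 (assumption).
    return 5.0 if a == b else -4.0


def smith_waterman_score(query: str, subject: str,
                         score: Callable[[str, str], float] = toy_score,
                         gap_open: float = 10.0,
                         gap_extend: float = 0.5) -> float:
    """Best local alignment score with affine gaps (Gotoh recurrences)."""
    neg_inf = float("-inf")
    m = len(subject)
    h_prev = [0.0] * (m + 1)      # best score ending at the previous row
    f_prev = [neg_inf] * (m + 1)  # previous row, alignment ends in a gap in subject
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # current row, alignment ends in a gap in query
        for j in range(1, m + 1):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            diag = h_prev[j - 1] + score(query[i - 1], subject[j - 1])
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

With the real BLOSUM62 table substituted for toy_score, aligning the query against itself would be expected to give the "Perfect score" reported in the header, although the report does not define that term.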



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 

                   %
Result           Query
   No.    Score  Match  Length  DB  ID       Description

     1   1178.5   43.9     501   2  A59090   aspartic proteinas
     2    367.5   13.7     383   2  JC7573   pepsinogen C - Afr
     3    363.5   13.5     377   1  PEMQCJ   gastricsin (EC 3.4
     4    355.5   13.2     384   2  A39314   gastricsin (EC 3.4
     5    355     13.2     389   2  JE0371   pepsin C (EC 3.4.2
     6    353     13.1     388   2  A29937   gastricsin (EC 3.4
     7    351.5   13.1     388   2  JC7246   pepsinogen C - com
     8    324.5   12.1     394   2  B43356   gastricsin (EC 3.4
     9    320     11.9     385   2  JC7575   pepsinogen A - bul
    10    320     11.9     402   1  REMSK    renin (EC 3.4.23.1
    11    313.5   11.7     509   2  S66516   oryzasin (EC 3.4.2
    12    313     11.6     392   1  A24608   gastricsin (EC 3.4
    13    310     11.5     383   2  A41443   pepsin (EC 3.4.23.
    14    308.5   11.5     412   1  KHHUD    cathepsin D (EC 3.
    15    306.5   11.4     410   1  KHMSD    cathepsin D (EC 3.
    16    305.5   11.4     401   1  REMSS    renin (EC 3.4.23.1
    17    305     11.4     384   2  JC7574   pepsinogen A - Afr
    18    305     11.4     407   1  KHRTD    cathepsin D (EC 3.
    19    302     11.2     405   2  A25379   saccharopepsin (EC
    20    301.5   11.2     398   2  S66465   cathepsin E (EC 3.
    21    300.5   11.2     387   2  C38302   pepsin (EC 3.4.23.
    22    299     11.1     398   2  I51185   cathepsin D (EC 3.
    23    298.5   11.1     387   2  D38302   pepsin (EC 3.4.23.
    24    298.5   11.1     400   2  I47099   renin (EC 3.4.23.1
    25    297     11.1     388   1  PEHU     pepsin A (EC 3.4.2
    26    296     11.0     388   2  A30142   pepsin A (EC 3.4.2
    27    296     11.0     388   2  B30142   pepsin A (EC 3.4.2
    28    294.5   11.0     388   1  S19684   pepsin A (EC 3.4.2
    29    292     10.9     506   2  T07915   probable aspartic
    30    291     10.8     388   1  S19682   pepsin A (EC 3.4.2
    31    291     10.8     402   1  RERTK    renin (EC 3.4.23.1
    32    291     10.8     406   1  REHUK    renin (EC 3.4.23.1
    33    290.5   10.8     396   2  S36865   cathepsin E (EC 3.
    34    289     10.8     387   2  E38302   pepsin (EC 3.4.23.
    35    288     10.7     387   2  B38302   pepsin (EC 3.4.23.
    36    288     10.7     388   1  PEMQAJ   pepsin A (EC 3.4.2
    37    287.5   10.7     632   2  T45858   hypothetical prote
    38    287     10.7     391   2  A43356   cathepsin E (EC 3.
    39    287     10.7     396   2  A34401   cathepsin E (EC 3.
    40    286.5   10.7     334   2  JC4870   pepsin A (EC 3.4.2
    41    286     10.6     382   1  PECH     pepsin A (EC 3.4.2
    42    286     10.6     388   1  PEMQAR   pepsin A (EC 3.4.2
    43    285.5   10.6     387   2  JC7245   pepsinogen A - com
    44    285     10.6     396   2  T47207   aspartic proteinas
    45    284.5   10.6     386   1  PEPG     pepsin A (EC 3.4.2
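
The "% Query Match" column appears to be the hit score expressed as a percentage of the perfect score (2687) given in the run header. The report does not state this formula; it is inferred from the printed values, and the function name below is illustrative only.

```python
PERFECT_SCORE = 2687.0  # "Perfect score" from the run header


def query_match_percent(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score, rounded to one decimal place."""
    return round(100.0 * score / perfect, 1)


# Scores of results 1, 2 and 45 from the summary table.
for score in (1178.5, 367.5, 284.5):
    print(score, query_match_percent(score))
```

The same check is a quick way to catch a digit misread in either the Score or the Query Match column of a scanned report.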



ALIGNMENTS 



RESULT 1 
A59090 

aspartic proteinase (EC 3.4.23.-) BACE precursor - human 

N;Alternate names: beta-secretase; beta-site APP cleaving enzyme
C;Species: Homo sapiens (man)
C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 11-May-2000
C;Accession: A59090
R;Vassar, R.; Bennett, B.D.; Babu-Khan, S.; Kahn, S.; Mendiaz, E.A.; Denis, P.;
Teplow, D.B.; Ross, S.; Amarante, P.; Loeloff, R.; Luo, Y.; Fisher, S.; Fuller,
J.; Edenson, S.; Lile, J.; Jarosinski, M.A.; Biere, A.L.; Curran, E.; Burgess,
T.; Louis, J.C.; Collins, F.; Treanor, J.; Rogers, G.; Citron, M.
Science 286, 735-741, 1999
A;Title: beta-Secretase cleavage of Alzheimer's amyloid precursor protein by the
transmembrane aspartic protease BACE.
A;Reference number: A59090; MUID:20002972; PMID:10531052
A;Note: submitted to GenBank, September 1999
A;Accession: A59090
A;Status: not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-501 <VAS>
A;Cross-references: GB:AF190725; NID:g6118538; PIDN:AAF04142.1; PID:g6118539
C;Genetics:
A;Gene: BACE
C;Superfamily: beta-secretase
C;Keywords: Alzheimer's disease; aspartic proteinase; brain; glycoprotein;
hydrolase; protein digestion; transmembrane protein; zymogen
F;1-21/Domain: signal sequence #status predicted <SIG>
F;22-45/Domain: propeptide #status predicted <PRO>
F;46-501/Product: acid proteinase BACE #status predicted <MAT>
F;461-477/Domain: transmembrane #status predicted <TRN>
F;93,289/Active site: Asp #status predicted
F;153,172,223,354/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;330-380/Disulfide bonds: #status predicted

Query Match 43.9%; Score 1178.5; DB 2; Length 501; 

Best Local Similarity 46.2%; Pred. No. 3.8e-80; 

Matches 240; Conservative 82; Mismatches 164; Indels 33; Gaps 9; 



Qy 7 ALLLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRWAPTPGPGTPAERHADGLA 61
I 1 1 1 1 : : 1 1 1 1 1 1 1 M N
Db 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42

Qy 62 LALE--PALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVA 119
| | | : | : | I I I 1 : 1 1 1 : 1 1 1 : 1 1 : 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 43 LPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102

Qy 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179
| | : : | : : 1 1 1 1 1 1 1 1 M 1 1 1 - 1 M 1 : 1 1 1 i - III
Db 103 AAPHPFLHRYYQRQLSSTYRDLRKGVTVPYTQGKWEGELGTDLVSIPHGPNVTVRANIT^A 162

Qy 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239
| ||: ||: I 1 1 1 1 1 1 1 II : 1 : 1 III 1 1 1 1 1 1 1 : : 1 1 : 1 1 : 1 : 1 M 1 1 :
Db 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPL 222

Qy 240 AGS---GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296
| : | | | : :: | | | : I I 1 1 : 1 1 1 1 1 : INI:: | : : : I I II 1 : 1 1 : 1 1 1
Db 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282

Qy 297 ADKAIVDSGTTLLRLPQKVFDAVV^VARASLIPEFSDGFWTGSQIACWTNSETPWSYFP 356
I I : 1 1 1 1 1 1 1 1 1 1 1 : M 1 : 1 1 : : : II : 1 1 1 1 1 1 1 1 1 1 1 M : 1 1
Db 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342

Qy 357 KISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGF 415
1 1 : 1 1 | : : : 1 1 1 1 1 1 1 1 1 1 : : 1 : : : 1 1 : 1 1 1 1 : 1 : 1 1 : 1 1 1 1
Db 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402

Qy 416 WIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYAL 475
1 I : I I 1 1 : 1 1 : 1 1 1 1 1 : : 1 1 1 1 1 : 1 1 : : 1 :
Db 403 YWFDRARKRIGFAVSACHVHDEFRTAAVEGPFWLDMEDCGYNIPQTDESTLMTIAYVM 462

Qy 476 MSVCGAILLVLIVLLLLPFRCQR--RPRDPEWNDESSL 512
: : | | : : : : 1 : : : 1 1 1 1 : : : 1 1 1
Db 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500
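
The "Best Local Similarity" figure printed with each RESULT appears to be the number of identical positions divided by the total number of alignment columns (matches + conservative + mismatches + indels). This relation is inferred from the printed statistics rather than stated in the report, so the sketch below should be read as a consistency check, with an illustrative function name.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# RESULT 1 (A59090): Matches 240; Conservative 82; Mismatches 164; Indels 33
print(best_local_similarity(240, 82, 164, 33))
# RESULT 2 (JC7573): Matches 132; Conservative 70; Mismatches 154; Indels 101
print(best_local_similarity(132, 70, 154, 101))
```

Because the four counts and the percentage are printed independently, the relation can also be used to recover a single count that the scan has rendered illegible.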





RESULT 2 
JC7573 



pepsinogen C - African clawed frog 

N;Alternate names: progastricsin 

C; Species: Xenopus laevis (African clawed frog) 

C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001 
C;Accession: JC7573; PC7118 

R;Ikuzawa, M. ; Inokuchi, T.; Kobayashi, K. ; Yasumasu, S. 
J. Biochem. 129, 147-153, 2001 

A; Title: Amphibian pepsinogens: Purification and characterization of Xenopus 

pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens. 

A; Reference number: JC7573; MUID: 21064922 ; PMID : 11134969 

A; Contents: Stomach 

A;Accession: JC7573 

A; Molecule type: mRNA 

A; Residues: 1-383 <IKU> 

A;Cross-references: DDBJ:AB045379

A;Accession: PC7118 

A;Molecule type: protein 

A; Residues: 17-68 <IK2> 

C; Comment: This protein is a zymogen for gastric aspartic proteinase, with 

pepsin-like activity. 

C; Genetics : 

A; Gene: PgC 

C;Superfamily: pepsin

C; Keywords: stomach; zymogen 

Query Match 13.7%; Score 367.5; DB 2; Length 383; 

Best Local Similarity 28.9%; Pred. No. 9.1e-20; 

Matches 132; Conservative 70; Mismatches 154; Indels 101; Gaps 25;

Qy 1 MGALARALLL P L LAQWLL RAAP ELAP AP FT L P L RVAAATNRWAP T P GP GT PAERHAD GL 60
| | | | : I : : : : I I : I I : I I II : : :
Db 1 MKFLILALVCLQLSEGIIR VPLKKFKSMREVMRENGI KAPLVDPAT KYYNQY 52

Qy 61 ALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
| | | | : I : : I I I I : II I I I I : I I I I I I I II
Db 53 ATAYEP LSNYMDM SYYGEISIGTPPQNFLVLFDTGSSNLWVAS 95

Qy 121 TPHSYIDT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSF 173
| | : I : : I I II I : : : I I I I I : I I I I I
Db 96 T YCQSQACTNHPLFNPSQSSTYSSNQQQFSLQYGTGSLTGILGYDTVTIQ 145

Qy 174 LVNIATIFESENFFL PG IKWNGILGLAYATLAKPSSSLETFFDSLVTQANI 224
| : | : | | || : : : I I I I I I I : : I : : I : : I I :
Db 146 — NVA — ISQQEFGLSETEPGTNFVYAQFDGILGLAYPSIAVGGAT — TVMQGMM-QQNL 198

Qy 225 PN — VFSMQMCGAGLPVAGSGTNGGSLVLGGI EP SLYKGDIWYTPI KEEWYYQI EI LKLE 282
| : | : | I IN: I I : : : I I I : : II : I I : II I
Db 199 LNQPIFGFYLSGQ SSQNGGEVAFGGVDQNYYTGQIYWTPVTSETYWQIGIQGFS 252

Qy 283 IGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQL 342
| ||: I : : I I I I : I I : I I Mil:::::: : I :
Db 253 INGQATGW-CSQ — GCQAIVDTGTSLLTAPQS VFSSLIQSIG AQQDQNGQYV 301

Qy 343 ACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYI-QPMMGAGLNYECYRFGIS— 399
: | : III: III: I I : I I I I I
Db 302 VSCSNIQN LPTISFTI SGVSFPLP — PSAYVLQQSSG YC-TIGIMPT 345



Qy 400 — PSTNA LVIGATVMEGFYVI FDRAQKRVGFAAS 431

Ml : : I : : I : : I : I I I I : 

Db 346 YLPSQNGQPLWILGDVFLREYYSVYDLGNNQVGFATA 382 



RESULT 3 
PEMQCJ 

gastricsin (EC 3.4.23.3) precursor - Japanese macaque (fragment) 
N;Alternate names: pepsin C 

C; Species: Macaca fuscata (Japanese macaque) 

C;Date: 13-Aug-1986 #sequence_revision 19-Oct-1995 #text_change 18-Jun-1999
C;Accession: S19683; A00986; A22402; S16066 
R;Kageyama, T.; Tanabe, K. ; Koiwai, O. 
Eur. J. Biochem. 202, 205-215, 1991 

A; Title: Development-dependent expression of isozymogens of monkey pepsinogens 
and structural differences between them. 

A;Reference number: S19681; MUID : 92037645; PMID: 1935977 
A;Accession: S19683 
A;Molecule type: mRNA 
A; Residues: 1-377 <KAG> 

A;Cross-references: EMBL:X59754; NID:g38072; PIDN : CAA42426 . 1 ; PID:g38073 

R;Kageyama, T. ; Takahashi, K. 

J. Biol. Chem. 261, 4406-4419, 1986 

A; Title: The complete amino acid sequence of monkey progastricsin. 
A; Reference number: A00986; MUID : 86168133 ; PMID: 3514597 
A;Accession: A00986
A;Molecule type: protein 

A;Residues: 6-330,'V',332-349,'VY',350-377 <KA2>
R;Kageyama, T.; Takahashi, K.
J. Biochem. 97, 1235-1246, 1985 

A; Title: Monkey pepsinogens and pepsins. VII. Analysis of the activation process 

and determination of the NH2-terminal 60-residue sequence of Japanese monkey 

progastricsin, and molecular evolution of pepsinogens. 

A;Reference number: A22402; MUID : 85289106; PMID:3928607 

A;Accession: A22402

A; Molecule type: protein 

A; Residues: 6-65 <KA3> 

C; Comment: This enzyme has more restricted specificity than pepsin A. 

C; Comment: The enzyme is activated in a two-step process that gives rise to two 

end products. The shorter, Ser-gastricsin, is the major product. 

C;Superfamily: pepsin

C; Keywords: aspartic proteinase; gastric juice; hydrolase; protein digestion; 
stomach 

F;1-5/Domain: signal sequence (fragment) #status predicted <SIG>
F;6-377/Product: progastricsin #status experimental <ZYM>
F;6-45/Domain: activation peptide #status experimental <APT>
F;46-377/Product: Gly-gastricsin #status experimental <MIN>
F;49-377/Product: Ser-gastricsin #status experimental <MAT>
F;31-32/Cleavage site: Phe-Leu (pepsin) #status experimental
F;45-46/Cleavage site: Phe-Gly (pepsin) #status experimental
F;48-49/Cleavage site: Leu-Ser (pepsin) #status experimental
F;80,265/Active site: Asp #status predicted
F;93-98,256-260,299-332/Disulfide bonds: #status experimental

Query Match 13.5%; Score 363.5; DB 1; Length 377; 

Best Local Similarity 28.9%; Pred. No. 1.8e-19; 

Matches 118; Conservative 65; Mismatches 118; Indels 107; Gaps 19; 



Qy 56 HAD GLALALEP ALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSN 115 

I I : : : I I : I : I I : I : I I I I I I : 1 I I I I I I 

Db 44 HFGDLSVSYEP MAYMD AAYFGEISIGTPPQNFLVLFDTGSSN 85 

Qy 116 FAV AGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPK 167 

| I I I I I : I I I I : I : : : I I I I I I I I : I : 

Db 86 LWVPSVYCQSQACTSHS RFNPSESSTYSTNGQTFSLQYGSGSLTGFFGYDTLTV 139 

Qy 168 GFNTSFLVNIATIFESENFFLPG IKWNGILGLAYATLAKPSSSLETFFDSLVTQA 222 

| | Ml II : : : II : I I I I M : : : I : I : 

Db 140 QSIQVPNQEFGLSEN — E PGTN FVYAQ FD GI MGLAY P T L S VD GAT — TAMQGMVQEG 192 

Qy 223 NI PN- VFSMQMCGAGLPVAGS GTNGGS LVLGGI EPS L YKGDI W YT P I KEEW YYQI EI LKL 281 

: : : | | : : I : : I I : : I I I : : I I I I I : : I : : I I : I I I = 

Db 193 ALTSPIFSVYLSDQ QGSSGGAWFGGVDSSLYTGQIYWAPVTQELYWQIGIEEF 246 

Qy 282 EIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQ 341 

||||: II : II I I : I I : I I : I I : I : : : I I I : I 

Db 247 LI GGQAS GW- C S E — GCQAI VDT GT S LLT VPQQ YMS ALLQA TGAQ 288 

Qy 342 LACWTNSETPWSYF PKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY 391 

| : | I : : : : I I I II 
Db 289 EDEYGQFLVNCNSIQNLPTLTFII NGVEFPLPPSSYI LNN 328

Qy 392 ECY-RFGISP STNALVI GATVMEGFYVT FDRAQKRVGFAAS 431 

| | : | | : : I : : I : : I : Mill: 

Db 329 NGYCTVGVEPTYLSAQNSQPLWILGDVFLRSYYSVYDLSNNRVGFATA 376 



RESULT 4 
A39314 

gastricsin (EC 3.4.23.3) precursor - bullfrog 
C; Species: Rana catesbeiana (bullfrog) 

C;Date: 19-Jun-1992 #sequence_revision 19-Jun-1992 #text_change 22-Jun-1999 
C; Accession: A39314 

R;Yakabe, E.; Tanji, M.; Ichinose, M.; Goto, S.; Miki, K.; Kurokawa, K.; Ito,
H.; Kageyama, T.; Takahashi, K.

J. Biol. Chem. 266, 22436-22443, 1991 

A; Title: Purification, characterization, and amino acid sequences of pepsinogens 

and pepsins from the esophageal mucosa of bullfrog (Rana catesbeiana) . 

A; Reference number: A39314; MUID : 92042186; PMID: 1939266 

A;Accession: A39314

A; Status : preliminary 

A;Molecule type : mRNA 

A; Residues: 1-384 <YAK> 

A;Cross-references: GB:M73750; NID:g213687; PIDN : AAA49530 . 1 ; PID:g213688 
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; hydrolase; protein digestion 

Query Match 13.2%; Score 355.5; DB 2; Length 384; 

Best Local Similarity 26.5%; Pred. No. 7.2e-19; 

Matches 120; Conservative 73; Mismatches 136; Indels 123; Gaps 21; 

Qy 23 E LAP AP FT L P L RVAAATN RW APT PGPGTPAERHADGLALALEP ALAS PAGAAN 76
: | : : || : : I : II II : : : | I II II
Db 12 QLSEGIIKVPLKKFKSMREVMRDHGIKAPWDPAT KYYNNFATAFEP LAN 61



Qy 77 FLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDT Y 129 

: : I 111:111111 : I I I I I I I I I : I : 
Db 62 YMDM SYYGEISIGTPPQNFLVLFDTGSSNLWV PSTYCQSQACTNHPQ 108 

Qy 130 FDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFL- 188 

I : : I I : I I : : : I I I I I : I I I i III : I I 

Db 109 FNPSQSSSYSSNQQQFSLQYGTGSLTGILGYDTVQIQ NIA — ISQQEFGLS 157 

Qy 189 PG IKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPN — VFSMQMCGAGLP 238 

|| : : : | | | | | | | : : | : : : | : : I I : I : I : : I 
Db 158 VTEPGTNFVYAQFDGILGLAYPSIAEGGAT — TVMQGMI-QQNLINQPLFAFYLSG 210 

Qy 239 VAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNAD 298 

: III: | | : : : | | | : : M : I I : I I I : I I : I : 

Db 211 -QQNSQNGGEVAFGGVDQNYYSGQIYWTPVTSETYWQIGIQGFSVNGQATGW-CSQ — GC 266 

Qy 299 KAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQ LACWTNSET 350 

: I I I : I I : I I II II :::::: I : I : : I 

Db 267 QGIVDTGTSLLTAPQSVFSSLMQSI GAQQDQNGQYAVSCSNIQSL 311 

Qy 351 PWSYFP KISIYLRDENS SRSFRITILPQLYIQPMMGAGLNYECYRFGIS 399 

I I I I : : I I : III I I : 

Db 312 PTISFTISGVSFPLPPSAYVLQQNSGYCTIGIMPTYLPSQNGQPLW 357 

Qy 400 P STNALVI GATVMEGFYVI FDRAQKRVGFAAS 431 

: : I : : I : : I : I I I I I : 

Db 358 ILGDVFLRQYYSVYDLGNNQVGFAAA 383



RESULT 5 
JE0371 

pepsin C (EC 3.4.23.-) precursor - chicken 
N;Alternate names: pepsinogen C 
C; Species: Gallus gallus (chicken) 

C;Date: 23-Jul-1999 #sequence_revision 23-Jul-1999 #text_change 11-May-2000
C;Accession: JE0371 

R; Sakamoto, N. ; Saiga, H. ; Yasugi, S. 

Biochem. Biophys . Res. Commun. 250, 420-424, 1998 

A; Title: Analysis of temporal expression pattern and cis-regulatory sequences of 
chicken pepsinogen A and C. 

A; Reference number: JE0370; MUID: 98440813; PMID: 9753645 

A;Accession: JE0371

A; Status : preliminary 

A; Molecule type: mRNA 

A; Residues: 1-389 <SAK> 

C;Superfamily: pepsin

C; Keywords: aspartic proteinase; hydrolase 

Query Match 13.2%; Score 355; DB 2; Length 389; 

Best Local Similarity 28.7%; Pred. No. 7.9e-19; 

Matches 114; Conservative 58; Mismatches 121; Indels 104; Gaps 16; 

Qy 75 ANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGT PHSYI 126
: I I : I : I I I : I I I I I I : I I I I I I I I I I :
Db 56 SNFATAYEPLANNMDMSYYGEISIGTPPQNFLVLFDTGSSNLWVPSTLCQSQACANHN — 113



Qy 127 DTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFN ■ TS 172 

II III::: : : : I I I I I I I I I I : I : II 

Db 114 — EFDPNESSTFSTQDEFFSLQYGSGSLTGIFGFDTVTI-QGISITNQEFGLSETEPGTS 170 

Qy 173 FLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPN VFSM 230 

||: : : I I I I I I : : : I : I : I I : : I I I 

Db 171 FLYS PFDGILGLAFPSI SAGGATTVMQKMLQENLLDFPVFSF 212 

Qy 231 QMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNL 290 

: I I : I I I I I I : : I : II I I : I I : : I : I I I Mill 

Db 213 YLSGQ EGSQGGELVFGGVDPNLYTGQITWTPVTQTTYWQIGIEDFAVGGQSSGW 266 

Qy 291 DCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSET 350 

I : : I I I : M : I I : I : I I : : : : : I = I = I : I I 

Db 267 -CSQ — GCQGIVDTGTSLLTVPNQVFTELMQYIG AQADD SGQYVASCSNIE- 314 

Qy 351 PWSYFPKI SIYLRDENS SRSFRITILPQLYIQPMMGAGLNYECY 394 

Ml I I : II : III M: 
Db 315 YMPTITFVISGTSFPLPPSAYMLQSNSDYCTVGIESTYLPSQTGQPLW — 362 

Qy 395 RFGI S P S TN AL VI GAT VME G F YVI FD RAQ K RVG FAAS 431 

: : I : : I I : I : I I I I : 

Db 363 ILGDVFLRVYYSIYDMGNNQVGFATA 388 



RESULT 6 
A29937 

gastricsin (EC 3.4.23.3) precursor - human 
N;Alternate names: pepsin C; pepsinogen C 
C; Species: Homo sapiens (man) 

C;Date: 17-Oct-1988 #sequence_revision 17-Oct-1988 #text_change 31-Mar-2000
C;Accession: A29937; A31811; PX0028; I54213; A91125; A23458
R;Hayano, T.; Sogawa, K.; Ichihara, Y.; Fujii-Kuriyama, Y.; Takahashi, K.
J. Biol. Chem. 263, 1382-1385, 1988 

A; Title: Primary structure of human pepsinogen C gene. 
A; Reference number: A29937; MUID: 88087276; PMID:3335549 
A; Accession: A29937 
A;Molecule type: DNA 
A; Residues: 1-388 <HAY> 

R;Taggart, R.T.; Cass, L.G.; Mohandas, T.K.; Derby, P.; Barr, P.J.; Pals, G. ; 
Bell, G.I. 

J. Biol. Chem. 264, 375-379, 1989 

A; Title: Human pepsinogen C (progastricsin) . Isolation of cDNA clones, 
localization to chromosome 6, and sequence homology with pepsinogen A. 
A;Reference number: A31811; MUID: 89079679; PMID:2909526 
A;Accession: A31811 
A; Molecule type: mRNA 
A; Residues: 1-388 <TAG> 

A;Cross-references: GB:J04443; NID:g551175; PIDN :AAA60074 . 1 ; PID:g551176 
R;Athauda, S.B.P.; Tanji, M. ; Kageyama, T.; Takahashi, K. 
J. Biochem. 106, 920-927, 1989 

A; Title: A comparative study on the NH2-terminal amino acid sequences and some 

other properties of six isozymic forms of human pepsinogens and pepsins. 

A; Reference number: PX0023; MUID: 90130402; PMID:2515193 

A; Accession: PX0028 

A;Molecule type: protein 

A; Residues: 17-101 <ATH> 



R;Pals, G . ; Azuma, T.; Mohandas, T.K.; Bell, G.I.; Bacon, J.; Samloff, I.M.; 
Walz, D.A.; Barr, P.J.; Taggart, R.T. 
Genomics 4, 137-148, 1989 

A; Title: Human pepsinogen C (progastricsin) polymorphism: evidence for a single 
locus located at 6p21.1-pter. 

A;Reference number: I54213; MUID:89290840; PMID:2567697
A;Accession: I54213

A; Status: translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 
A; Residues: 1-388 <RES> 

A;Cross-references: GB:M23077; NID:g189830; PIDN:AAA60063.1; PID:g387015;
GB:J03063 

A; Note: parts of this sequence, including the amino end and carboxyl ends of the 

mature protein, were determined by protein sequencing 

R;Foltmann, B.; Jensen, A.L. 

Eur. J. Biochem. 128, 63-70, 1982 

A; Title: Human progastricsin. Analysis of intermediates during activation into 

gastricsin and determination of the amino acid sequence of the propart. 

A;Reference number: A91125; MUID : 83079318 ; PMID: 6816595 

A;Accession: A91125 

A; Molecule type: protein 

A;Residues: 17-39,'ED',42-51,'S',53-64 <FOL>
A;Note: pro-form; 29-Leu was also found 

A;Note: activation at pH 2 is proposed to involve conformation change, cleavage 

after Phe-42, and cleavage after Leu-59 

C; Genetics : 

A; Gene: GDB : PGC 

A;Cross-references: GDB: 119485; OMIM: 169740 
A;Map position: 6p21 . 3-6p21 . 1 

A;Introns: 20/2; 70/3; 110/1; 149/3; 216/2; 256/2; 305/3; 338/3 
C;Superfamily: pepsin

C;Keywords: aspartic proteinase; hydrolase; protein digestion; stomach; zymogen
F;1-16/Domain: signal sequence #status predicted <SIG>
F;17-59/Domain: propeptide #status experimental <PRO>
F;60-388/Product: gastricsin #status experimental <MAT>

Query Match 13.1%; Score 353; DB 2; Length 388; 

Best Local Similarity 29.1%; Pred. No. 1.1e-18;

Matches 120; Conservative 65; Mismatches 120; Indels 108; Gaps 21; 



Qy 52 PAERHADG-LALALEPALASPAGAANFLAMVDNLQGDSG 110
||:: ||:: I I : I : I I : I : I I I I I I : I I
Db 50 PAWKYRFGDLSVTYEP MAYMD AAYFGEI S I GT P PQNFLVLFD 91

Qy 111 TGSSNFAV AGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDL 162
Mill I I I I I I : I I I I : I : : : I I I I I I I I
Db 92 TGSSNLWVPSVYCQSQACTSHS RFNPSESSTYSTNGQTFSLQYGSGSLTGFFGYDT 147

Qy 163 VTIPKGFNTSFLVNIATIFESENFFLPG IKWNGILGLAYATLAKPSSSLETFFDS 217
: | : I I III II : : : | | : | | | | I : : : I
Db 148 LTV QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMGLAYPALSVDEAT — TAMQG 198

Qy 218 LVTQANIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQI 276
: | : : : I I I : : I : : I I : : I I I : : I I I I I : : I : : I I : I I
Db 199 MVQEGALTSPVFSVYLSNQ QGS SGGAWFGGVDS SLYTGQI YWAPVTQELYWQI 252

Qy 277 EILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGF 336



Db 



295 



Qy 337 WTGSQLACWTNSETPWSYF PKISIYLRDENSSRSFRITILPQLYIQPMMG 38 6 

I I : I I : I I : : = : I I I 
Db 296 -TGAQ EDEYGQFLVNCNSIQNLPSLTFII NGVEFPLPPSSYI 336 

Qy 387 AGLNYECY- RFGI S P STNA LVI GATVMEGFYVI FDRAQKRVGFAAS 431 

| : I I : I || : : I : : I : : I Mill: 

Db 337 --LSNNGYCTVGVEPTYLSSQNGQPLWILGDVFLRSYYSVYDLGNNRVGFATA 387 



RESULT 7 
JC7246 

pepsinogen C - common marmoset 

C; Species: Callithrix jacchus (common marmoset) 

C;Date: 09-Jun-2000 #sequence_revision 09-Jun-2000 #text_change 21-Jul-2000 
C;Accession: JC7246 
R;Kageyama, T. 

J. Biochem. 127, 761-770, 2000 

A;Title: New world monkey pepsinogens A and C, and prochymosins . Purification, 

characterization of enzymatic properties, cDNA cloning, and molecular evolution. 

A; Reference number: JC7245 

A;Accession: JC7246 

A; Molecule type: mRNA 

A; Residues: 1-388 <KAG> 

A; Cross-references : DDBJ: AB038385 

A; Experimental source: strain NW791 

C; Comment: This protein, a zymogen of pepsins, is the major proteolytic enzyme 
in vertebrate gastric juices. It plays roles in gastric digestion, and is a 
useful molecular marker for clarifying the evolution of mammalian orders and 
families . 

C; Superf amily : pepsin 

C; Keywords: gastric juice; zymogen 

Query Match 13.1%; Score 351.5; DB 2; Length 388; 

Best Local Similarity 30.1%; Pred. No. 1.4e-18; 



Matches 112; Conservative 56; Mismatches 115; Indels 89; Gaps 17;

Qy 92 YYLEMLI GT P PQKLQ I LVDT G S SN FAV AGTPHSYIDTYFDTERSSTYRSKGF 143
1 : 1 : 1 1 1 1 1 1 : 1 1 1 M 1 1 1 MM 1 : 1 1 II 1 1
Db 73 YFGEISIGTPPQNFLVLFDTGSSNLWVPSVYCQSQACTSHS RFNPSASSTYSSNGQ 128

Qy 144 DVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPG IKWNGILG 198
: : : 1 11 III 1 1 : 1 : 1 1 III II : : : | I : I
Db 129 TFSLQYGSGSLTGFFGYDTLTV QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMG 181

Qy 199 LAYATLAKPSSSLETFFDSLVTQANIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPS 257
Ml 1 : : : 1 : : : : : 1 1 1 : | : : | | : : : 1 1 : : 1
Db 182 LAYPALSMGGAT — TAMQGMLQEGALTSPVFSFYLSNQ QGS S GGAVI FGGVDS S 233

Qy 258 LYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFD 317
I I I I : : 1 : : 1 1 : II 1 : II 1 1 : II : I I I I : 1 1 : 1 1 : M :
Db 234 L YT GQ I YWAP VTQELYWQ I G I EE FL I GGQAS GW- C S E — GCQAI VDTGT S LLTVPQQYMS 290

Qy 318 AWEAVARAS LI PEFS DGFWT GSQLACWTNS ET PWSYF PKISI YLRDENS 367

Db 291 AFLEA' TGAQ EDEYGQFLVNCDSIQNLPTLTFII 323



Qy 368 SRS FRITI LPQLYIQPMMGAGLNYECY- RFGI S P STNALVTGATVMEGFYVT F 419 

: I I I I : I I : I I : : I : : I : I 

Db 324 -NGVEFPLPPSSYI LSNNGYCTVGVEPTYLS SQNSQPLWI LGDVFLRS YYSVF 375 

Qy 420 D RAQKRVG FAAS 431 

I Mill: 
Db 376 DLGNNRVGFATA 387 



RESULT 8 
B43356 

gastricsin (EC 3.4.23.3) precursor - guinea pig 

N;Alternate names: pepsin C 

C;Species: Cavia porcellus (guinea pig) 

C;Date: 03-Feb-1994 #sequence_revision 03-Feb-1994 #text_change 22-Jun-1999 
C;Accession: B43356 

R;Kageyama, T. ; Ichinose, M. ; Tsukada, S.; Miki, K . ; Kurokawa, K. ; Koiwai, O.; 
Tanji, M. ; Yakabe, E.; Athauda, S.B.; Takahashi, K. 
J. Biol. Chem. 267, 16450-16459, 1992 

A;Title: Gastric procathepsin E and progastricsin from guinea pig. Purification, 
molecular cloning of cDNAs, and characterization of enzymatic properties, with 
special reference to procathepsin E. 

A;Reference number: A43356; MUID : 92355614 ; PMID: 1644829 
A;Accession: B43356 
A; Status: preliminary 
A; Molecule type: mRNA 
A; Residues: 1-394 <KAG> 

A;Cross-references: GB:M88652; NID:g191296; PIDN:AAA37053.1; PID:g191297
A;Note: sequence extracted from NCBI backbone (NCBIN:110805, NCBIP:110806)
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; gastric juice; hydrolase; protein digestion; 
stomach 

Query Match 12.1%; Score 324.5; DB 2; Length 394; 

Best Local Similarity 29.0%; Pred. No. 1.5e-16; 

Matches 107; Conservative 63; Mismatches 116; Indels 83; Gaps 18; 

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNF AVAGTPHSYIDTYFDTERSSTYRSKGF 143
: : : : I I I M I : I I I I I I I : : I I I I I : MM:
Db 79 YFGQISLGTPPQSFQVLFDTGSSNLWVPSVYCSSLACTTH TRFNPRDSSTYVATDQ 134

Qy 144 DVTVKYTQGSWTGFVGEDLVTI PK-GFNTSFLVNIATIFESENFFLPG IK 192
: : : I II I I I I : I I I I I I hi II =
Db 135 iFSLEYGTGSLTGVFGYDTMTIODIOVPKOEFGLS ETE PGSDFVYAE 181

Qy 193
:: I I I I I I | : : : : I I : : : : : I I : : II I : : I I
Db 182 FDGILGLGYPGLSEGGAT — TAMQGLLREGALSQS LFSVYL GSQQGSDEGQL 231

Qy 250 VLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN7\JDKAI VX)SGTTLL 309
: | M : : I I I I M : : I I : : I I : I I I I I : I : I I I : I I : I I
Db 232 ILGGVDESLYTGDIYWTPVTQELYWQIGIEGFLIDGSASGWCSR GCQGIVDTGTSLL 288

Qy 310 RLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSR 369
: | : I : I : I : : I : : | : : | I :



Db 289 TVPSDYLSTLVQAIGAEE — NEYGEYF VSCSSIQDLPTLTFVISGV 332 



Qy 370 SFRITILPQLYIQP MMGAGLNYECYRFGI S PSTN — ALVI GATVMEGFYVT FDRA 422 

: I I I I : I I : I I : : I : : I : : I I 

Db 333 — EFPLSPSAYILSGENYCMVGLESTY VSPGGGEPVWILGDVFLRSYYSVYDLA 384 

Qy 423 QKRVGFAAS 4 31 

Mill: 

Db 385 NNRVGFATA 393 



RESULT 9 
JC7575 

pepsinogen A - bullfrog 

C; Species: Rana catesbeiana (bullfrog) 

C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001 
C;Accession: JC7575 

R;Ikuzawa, M. ; Inokuchi, T.; Kobayashi, K. ; Yasumasu, S. 
J. Biochem. 129, 147-153, 2001 

A;Title: Amphibian pepsinogens: Purification and characterization of Xenopus 

pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens. 

A; Reference number: JC7573; MUID : 21064922 ; PMID: 11134969 

A; Contents: Stomach 

A;Accession: JC7575 

A;Molecule type: mRNA 

A;Residues: 1-385 <IKU> 

A;Cross-references: DDBJ:AB045376

C;Comment: This protein is a zymogen for gastric aspartic proteinase, with 

pepsin-like activity. 

C; Genetics : 

A; Gene : PgA 

C;Superfamily: pepsin

C; Keywords: stomach; zymogen 

Query Match 11.9%; Score 320; DB 2; Length 385; 

Best Local Similarity 27.8%; Pred. No. 3.2e-16; 

Matches 111; Conservative 67; Mismatches 147; Indels 74; Gaps 15;

Qy 50 GTPAERHADGLALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILV 109
I : : I I I : I I : I : I I I :: I I I I I I : :
Db 39 GDYLKKHHYNPATKYFPSLAQASG E P LQN YMD I E Y FGT I S I GT P PQ S FT VI F 90

Qy 110 DTG S SN FAVAGT PH S Y I DT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDL 162
I I I I M I I I : | : : : | | | : : : | : : : | I I : I I : I I
Db 91 DTGSSNLWV PSVYCSSPACTNHHMFNPQQSSTFQATNTPVSIQYGTGSMSGFLGYDT 147

Qy 163 VTIPKGFNTSFLWIATIFESE-NFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQ 221
: I I : : I I II : : I I I I I I : : I I II II- I
Db 148

Qy 222 ANIP-NVFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILK 280
I I : : I I : : : I : I : : | | : : | | | : : : I : I I : I I :
Db 203 GLIPQDLFSVYL S S QGQ S GS FVL FGGVDT S Y YT GN LNWVP LT AET YWQ I T VD S 255

Qy 281 LEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGS 340
INI: : M II : M : II I I : : I : : I : I :
Db 256 ISIGGQVIACS GSCSAIVDTGTSLLAGPSTPI-ANIQYYIGAN QDSNGQYV — 305



Qy 341 QLACWTNSETPWSYFP KISIYLRDENSS — RSFRITILPQLYIQPMMGAGLN 390 

: I I I I I I : I I I : I I 

Db 306 -INCNNISNMPTWFTINGVQYPLPASAWRQSQQSCTSGFQAMNLP 351 

Qy 391 YECYRFGI S PSTNALVIGATVMEGFYVI FDRAQKRVGFA 42 9 

: I : : : I : : I I : I I I I I I 

Db 352 TSSGDLWILGDVFIREYYWFDRANNYVAMA 382 



RESULT 10 
REMSK 

renin (EC 3.4.23.15) precursor, renal - mouse 

N;Alternate names: angiotensin-forming enzyme; angiotensinogenase; renin 1
C; Species: Mus musculus (house mouse) 

C;Date: 30-Jun-1987 #sequence_revision 30-Jun-1987 #text_change 18-Jun-1999 
C;Accession: A00989; S07636; A22766; A22058; I57576; A05137; JH0083
R;Holm, I.; Ollo, R.; Panthier, J.J.; Rougeon, F.
EMBO J. 3, 557-562, 1984 

A;Title: Evolution of aspartyl proteases by gene duplication: the mouse renin 

gene is organized in two homologous clusters of four exons . 

A; Reference number: A00989; MUID: 84182525 ; PMID: 6370686 

A; Accession: A00989 

A; Molecule type: DNA 

A; Residues: 1-402 <HOL> 

A; Cross-references : EMBL:X00850 

R;Kim, W.S.; Murakami, K. ; Nakayama, K. 

Nucleic Acids Res. 17, 9480, 1989 

A; Title: Nucleotide sequence of a cDNA coding for mouse Renl preprorenin. 
A;Reference number: S07636; MUID: 90067953 ; PMID:2685761 
A;Accession: S07636 
A;Molecule type: mRNA 
A;Residues: 1-402 <KIM>

A; Cross-references: EMBL:X16642; NID:g53930; PIDN : CAA34636 . 1 ; PID:g53931 
R;Mullins, J. J.; Burt, D.W.; Windass, J.D.; McTurk, P.; George, H.; Brammar, 
W.J. 

EMBO J. 1, 1461-1466, 1982 

A; Title: Molecular cloning of two distinct renin genes from the DBA/2 mouse. 
A; Reference number: A90968; MUID: 84207899; PMID: 6327270 
A; Accession: A22766 
A; Molecule type: mRNA 
A;Residues: 269-314,'D',316 <MUL>

R; Panthier, J.J.; Dreyfus, M. ; Roux, D.T.L.; Rougeon, F. 
Proc. Natl. Acad. Sci. U.S.A. 81, 5489-5493, 1984 

A;Title: Mouse kidney and submaxillary gland renin genes differ in their 5'
putative regulatory sequences. 

A; Reference number: A22058; MUID : 84298161 ; PMID: 6089205 
A; Accession: A22058 
A; Molecule type: DNA 
A; Residues: 1-30 <PAN> 

R;Field, L.J.; Philbrick, W.M.; Howles, P.N.; Dickinson, D.P.; McGowan, R.A.;
Gross, K.W.
Mol. Cell. Biol. 4, 2321-2331, 1984
A;Title: Expression of tissue-specific Ren-1 and Ren-2 genes of mice:
Comparative analysis of 5'-proximal flanking regions.
A;Reference number: I57576; MUID:85085936; PMID:6392850
A;Accession: I57576
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-31 <RES>
A;Cross-references: GB:K02800; NID:g200689; PIDN:AAA40044.1; PID:g200690

C;Comment: The only known function of renal renin is to release angiotensin I
from angiotensinogen in the plasma, initiating a cascade of reactions that
produces an elevation of blood pressure and increased sodium retention by the
kidney.
C;Comment: Renal renin is synthesized by the juxtaglomerular cells of the kidney
in response to decreased blood pressure and sodium concentration.
C;Genetics:
A;Gene: Ren-1
A;Introns: 31/2; 81/3; 123/1; 162/3; 228/2; 268/2; 316/3; 349/3
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; blood pressure control; glycoprotein;
hydrolase; kidney; plasma
F;1-21/Domain: signal sequence #status predicted <SIG>
F;22-64/Domain: propeptide #status predicted <PRO>
F;65-402/Product: renin #status predicted <MAT>
F;69,139,320/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;102,287/Active site: Asp #status predicted

Query Match 11.9%; Score 320; DB 1; Length 402; 

Best Local Similarity 28.6%; Pred. No. 3.4e-16; 

Matches 126; Conservative 66; Mismatches 181; Indels 68; Gaps 21; 
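The "Best Local Similarity" figure in these result blocks appears to be the share of exact matches among all alignment columns (matches, conservative substitutions, mismatches and indels). That reading is inferred from the reported numbers, not stated by the report itself, so the following Python sketch is only a consistency check under that assumption; the function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches.

    Assumption: the report counts every column exactly once as a match,
    a conservative substitution, a mismatch, or an indel.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts taken from the RESULT 10 block (REMSK, mouse renin).
pct = best_local_similarity(126, 66, 181, 68)
print(f"Best Local Similarity: {pct:.1f}%")
```

With the RESULT 10 counts this rounds to the 28.6% printed in the report; the same check can be repeated on any other result block.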

Qy 10 LPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPG-PGTPAERHADGLALALE 65 

: I I I IE : I I : I I I I : I III I : I 

Db 6 MPLWALLLL WSPCTFSLPTRTATFERIPLKKMPSVRE1LEERGVDMTRLSAEWGV 60 

Qy 66 PA LAS PAGAANFLAMVDNLQGDSGRGYYLEMLI GT PPQKLQI LVDTGS SNFAV 118 

I : Ml I : I II II I : I I I II I : : : I II I : I I 

Db 61 FTKRPSLTNLTSPWLTNYL NTQ — YYGEIGIGTPPQTFKVTFDTGSANLWV 110 

Qy 119 AGTPHSY IDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTS 172 

| | I : : : : I I : I I I I: I I I I : : I I I : I : 

Db 111 PSTKCSRLYLACGIHSLYESSDSSSYMENGSDFTIHYGSGRVKGFLSQDSVTV-GGITVT 169 

Qy 173 FLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PNVFSMQ 231 

I || | : : | : | | : : I : : I I : : : I : ill: 

Db 170 QTFGEVTELPLIPFML— AKFDGVLGMGFP— AQAVGGVTPVFDHILSQGVLKEEVFSVY 225 

Qy 232 MCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLD 291 

II I I : II I I : I I : I : I I : : I I : : : I I I 
Db 226 Y NRGSHLLGGEWLGGSDPQHYQGNFHYVSISKTDSWQITMKGVSVG — SSTLL 277 

Qy 292 CREYNADKAIVDSGTTLLRLPQKVFDAWE^V-ARASLIPEFSDGFWTGSQLACWTNSET 350 

|| | : | | : | : : : I : : : I : I : I I : : I I : 
Db 278 CEEGCA — VWDTGS S FI SAPT S S LKLIMQALGAKEKRI EEY WNC SQV 324 

Qy 351 PWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGL-NYECYRFGISPSTNAL-VIG 408 

I I I I I I : : : : : I I : I I I : I : I 

Db 325 P — TLPDISFDL GGRAYTLSSTDYVLQYPNRRDKLCTLALHAMDIPPPTGPVWVLG 378 

Qy 409 ATVMEGFYVIFDRAQKRVGFA 429 

II: II II I hill 
Db 379 ATFIRKFYTEFDRHNNRIGFA 399 



RESULT 11 
S66516 

oryzasin (EC 3.4.23.-) precursor - rice 
N;Alternate names: aspartic proteinase 1 
C; Species: Oryza sativa (rice) 

C;Date: 28-Oct-1996 #sequence_revision 13-Mar-1997 #text_change 20-Jun-2000 
C;Accession: S66516; S66517 

R;Asakura, T.; Watanabe, H. ; Abe, K. ; Arai, S. 
Eur. J. Biochem. 232, 77-83, 1995 

A; Title: Rice aspartic proteinase, oryzasin, expressed during seed ripening and 
germination, has a gene organization distinct from those of animal and microbial 
aspartic proteinases. 

A;Reference number: S66516; MUID:96048031; PMID:7556174
A;Accession: S66516
A;Molecule type: DNA
A;Residues: 1-509 <ASA>
A;Cross-references: EMBL:D32165; NID:g511665; PIDN:BAA06876.1; PID:g1030715
A;Accession: S66517
A;Molecule type: mRNA
A;Residues: 1-509 <ASZ>
A;Cross-references: EMBL:D32144; NID:g1255684; PIDN:BAA06875.1; PID:g1711289
C;Comment: The pair of saposin repeat homology domains tagged SAP1 and SAP2
represent a cyclical permutation of a single saposin repeat.
C;Genetics:
A;Introns: 119/3; 140/1; 171/3; 209/2; 265/3; 279/1; 300/3; 338/3; 360/2; 412/3;
452/3; 482/2
C;Superfamily: oryzasin; saposin repeat homology
C;Keywords: aspartic proteinase; hydrolase
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-68/Domain: propeptide #status predicted <PRO>
F;68-509/Product: aspartic proteinase 1 #status predicted <MAT>
F;316-361/Domain: saposin repeat homology #status atypical <SAP1>
F;370-420/Domain: saposin repeat homology #status atypical <SAP2>
F;103,290/Active site: Asp #status predicted

Query Match 11.7%; Score 313.5; DB 2; Length 509; 

Best Local Similarity 23.0%; Pred. No. 1.5e-15; 

Matches 127; Conservative 75; Mismatches 179; Indels 171; Gaps 19; 

Qy 3 ALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRV^ 62 

: : I I I : I I I I I : I II Mil I I I I I 

Db 5 SVALVLLAAVLLQALLPASAEEGLVRIALKKRPIDENSRVAARLSG EEGARRLGL 59 

Qy 63 ALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSN 115 

: | I : : | : : : I : I : : I I i I I I :: I I I I II 

Db 60 RGANSLGGGGGEGDIVALKNYMNAQ YFGEIGVGTPPQKFTVI FDTGSSNLWVPSAK 115 

Qy 116 — FAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSF 173 

| : : I || : : : | | | I : I : : I II II Mil: 

Db 116 CYFSIACFFHS RYKSGQSSTYQKNGKPAAIQYGTGSIAGFFSEDSVTVGD 165 

Qy 174 LVNIATIFESENFF LPGI KWNGILGLAYATLAKPSSSLETFFDSLVTQANI 224 

: : : I I I : I : : I I II I : : : : : 

Db 166 LWKDQEFIEATKEPGLTFMVAKFDGILGLGFQEISVGDA V 206 


Qy 225 PNVFSMQMCG-AGLPVAGSGTN GGSLVLGGIEPSLYKGDIWYTPIKEEWYYQI 276 

| : | | II I I I : I I I : : I I I I I : I I : ' ' I : I 

Db 207 PVWYKMVEQGLVSEPVFSFWFNRHSDEGEGGEIVFGGMDPSHYKGNHTYVPVSQKGYWQF 266 

Qy 277 EILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPE 331 

| : : I I I : : I : I I I I I I : I I I : • I : : : : 

Db 267 EMGDVLIGGKTTGF-CA— SGCSAIADSGTSLLAGPTAIITEINEKIGATGWSQECKTV 323 

Qy 332 FSDGF— — 336 

I : I 

Db 324 VSQYGQQILDLLLAETQPSKICSQVGLCTFDGKHGVSAGIKSWDDEAGESNGLQSGPMC 383 

Qy 337 WTGSQLACWTNSETPWSY FPKISIYLRD 364 

I : I I I : : I I : I I : 

Db 384 NACEMAVVWMQNQLAQNKTQDLILNYINQLCDKLPSPMGESSVDCGSLASMPEISFTIGA 443 

Qy 365 ENSSRSFRITILPQLYIQPMMGAGLNYECY RFGISPSTNAL-VIGATVMEGFYVIF 419 

: : : | : | | : I I :| II I ::| I :: :| 

Db 444 K KFALKPEEYIL-KVGEGAAAQCISGFTAMDIPPPRGPLWILGDVFMGAYHTVF 496 

Qy 420 DRAQKRVGFAAS 431 

I : I I I I I I 
Db 497 DYGKMRVGFAKS 508 



RESULT 12 
A24608 

gastricsin (EC 3.4.23.3) precursor - rat 
N; Alternate names: pepsinogen C 
N;Contains: pepsin A (EC 3.4.23.1) precursor 
C; Species: Rattus norvegicus (Norway rat) 

C;Date: 30-Jun-1988 #sequence_revision 05-Aug-1994 #text_change 18-Jun-1999 
C;Accession: A33510; A24608; C22434; A05145; A61298 

R;Ishihara, T.; Ichihara, Y.; Hayano, T.; Katsura, I.; Sogawa, K.;
Fujii-Kuriyama, Y.; Takahashi, K.
J. Biol. Chem. 264, 10193-10199, 1989
A;Title: Primary structure and transcriptional regulation of rat pepsinogen C
gene.
A;Reference number: A33510; MUID:89255508; PMID:2722863
A;Accession: A33510
A;Molecule type: DNA
A;Residues: 1-392 <ISH>
A;Cross-references: GB:M25985

R;Ichihara, Y.; Sogawa, K.; Morohashi, K.; Fujii-Kuriyama, Y.; Takahashi, K.
Eur. J. Biochem. 161, 7-12, 1986
A;Title: Nucleotide sequence of a nearly full-length cDNA coding for pepsinogen
of rat gastric mucosa.
A;Reference number: A24608; MUID:87054020; PMID:3780741
A;Accession: A24608
A;Molecule type: mRNA
A;Residues: 1-392 <ICH>
A;Cross-references: GB:X04644; NID:g56880; PIDN:CAA28305.1; PID:g56881
R;Ichihara, Y. ; Sogawa, K. ; Takahashi, K. 
J. Biochem. 98, 483-492, 1985 

A;Title: Isolation of human, swine, and rat prepepsinogens and calf 
preprochymosin, and determination of the primary structures of their NH2- 
terminal signal sequences. 



A;Reference number: A22434; MUID:86059312; PMID:2415509
A;Accession: C22434
A;Molecule type: protein
A;Residues: 1-19,'X',21-23,'X',25-29 <IC2>
R;Arai, K.M.; Muto, N.; Tani, S.; Akahane, K.
Biochim. Biophys. Acta 788, 256-261, 1984
A;Title: The N-terminal sequence of rat pepsinogen.
A;Reference number: A05145; MUID:84257697; PMID:6743670
A;Accession: A05145
A;Molecule type: protein
A;Residues: 17-30,'Q',32-102,'A',104-108,'L',110-112 <ARA>
A; Experimental source: Wistar strain 
R;Ichihara, Y. ; Sogawa, K.; Takahashi, K. 
J. Biochem. 92, 603-606, 1982 

A;Title: Rat gastric prepepsinogen: in vitro synthesis and partial amino-
terminal signal sequence.
A;Reference number: A61298; MUID:83030750; PMID:6182139
A;Accession: A61298
A;Molecule type: protein
A;Residues: 1,'XX',4-6,'X',8-9,'X',11,'X',13-14,'XXX',18-19,'X',21,'X',23,'XX',26,'X' <IC3>

C;Comment: This enzyme has more restricted specificity than pepsin A. It is the 
major form of pepsinogen in rat gastric mucosa. 
C;Genetics:
A;Introns: 20/2; 73/3; 113/1; 152/3; 219/2; 259/2; 309/3; 342/3
A;Note: there are at least two very similar genes for gastricsin in rat
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; gastric juice; hydrolase; protein digestion;
stomach
F;1-16/Domain: signal sequence #status experimental <SIG>
F;17-392/Product: pepsinogen #status experimental <MAT>
F;17-62/Domain: activation peptide #status experimental <ACT>
F;94,280/Active site: Asp #status predicted
F;107-112,270-275,314-347/Disulfide bonds: #status predicted




Query Match 11.6%; Score 313; DB 1; Length 392; 
Best Local Similarity 29.5%; Pred. No. 1.1e-15; 
Matches 105; Conservative 56; Mismatches 139; Indels 56; Gaps 16; 

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNFAV AGTPHSYIDTYFDTERSSTYRSKGF 143 

1 : 1 : 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 III: 1 : : 1 1 1 1 : : 1 

Db 76 YFGEISIGTPPQNFLVLFDTGSSNLWVSSVYCQSEACTTHA RFNPSKSSTYYTEGQ 131 

Qy 144 DVTVKYTQGSWTGFVGEDLVTI PKGFNTSFLVNI AT I FESENFFLPG IKWNGILG 198 

: : : 1 1 1 1 1 1 1 1 : 1 : 1 1 III II : : : 1 1 : 1 

Db 132 TFSLQYGTGSLTGFFGYDTLTV QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMG 184 

Qy 199 LAYAT LAK PSSSLETFFDS LVT QAN I PN VFSMQMC GAGL P VAG S — GTNGGSLVLGGIEP 256 
Ml I : I 1 : : : 1 : 1 1 II 1 : 1 1 1 : 1 1 1 : : 
Db 185 LAYPGLS— SGGATTALQGMLGE GALSQPLFGVYL GSQQGSNGGQIVFGGVDK 235 

Qy 257 S LYKGDIWYTPI KEEWYYQI EI LKLEI GGQSLNLDCREYN7VDKAI VDSGTTLLRLPQKVF 316 

: 1 1' 1 : 1 : 1 : : 1 1 : 1 1 1 III: 1 : 1 1 1 : 1 1 : 1 1 : 1 : 
Db 236 NLYTGEITWVPVTQELYWQITIDDFLIGDQASGW-CSSQGC-QGIVDTGTSLLVMPAQYL 293 

Qy 317 DAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 376 



Db 294 SELLQTIGAQE — GEYGEYF VSCDSVSS LPTLSFVL NGVQFPLS 335 

Qy 377 PQLY-IQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAAS 431 

I I I I : I : : I : : I I I I : I I I I 

Db 336 PSSYIIQEDNFCMVGLESISLTSESGQPLWILGDVFLRSYYAIFDMGNNKVGLATS 391 



RESULT 13 
A41443 

pepsin (EC 3.4.23.-) precursor, embryonic - chicken 
C; Species: Gallus gallus (chicken) 

C;Date: 05-Jun-1992 #sequence_revision 05-Jun-1992 #text_change 21-Jul-2000 
C; Accession: A41443 

R;Hayashi, K.; Agata, K. ; Mochii, M. ; Yasugi, S.; Eguchi, G.; Mizuno, T. 
J. Biochem. 103, 290-296, 1988 

A; Title: Molecular cloning and the nucleotide sequence of cDNA for embryonic 

chicken pepsinogen: phylogenetic relationship with prochymosin. 

A; Reference number: A41443; MUID: 88227903; PMID: 3131317 

A;Accession: A41443 

A; Status: preliminary 

A;Molecule type: mRNA 

A; Residues: 1-383 <HAY> 

A;Cross-references: GB:D00215; NID:g2760810; PIDN:BAA00153.1; PID:g222853
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; hydrolase; protein digestion

Query Match 11.5%; Score 310; DB 2; Length 383; 

Best Local Similarity 26.8%; Pred. No. 1.8e-15; 

Matches 106; Conservative 63; Mismatches 136; Indels 90; Gaps 15; 

Qy 56 HA — DGLALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGS 113 

1111:111 III I I : I I I I I I :: I I I I 

Db 55 HAFPDVLTWTEPLL NTLDM EYYGTI SI GTPPQDFTWFDTGS 97 

Qy 114 SNFAVAG TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGF 169 

II I : I I : : I I I I : I I : : : : I I I I I I I I : 

Db 98 SNLWVPSVSCTSPACQSHQMFNPSQSSTYKSTGQNLSIHYGTGDMEGTVGCDTVTVASLM 157 

Qy 170 NTSFLWIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PNVF 228 

: I : I : : I I II: : I : : I I I I I I : I I : : I I : : I : : : I : I 

Db 158 DTNQLFGLST-SEPGQFFV-YVKFDGILGLGYPSLA — ADGI T PVFDNMVNES LLEQNLF 213 

Qy 229 SMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSL 288 

I : : : I : I I I I : I : I I : I : : I : I I : : : I : 

Db 214 SVYLSREPM GSMWFGGI DES YFTGS INWI PVS YQGYWQI SMDSI I VNKQEI 265 

Qy 289 NLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNS 348 

: : I I : I : I I : I : I : : I I I 

Db 266 ACS SGCQAIIDTGTSLVAGPASDINDIQSAVG ANQ 300 

Qy 349 ETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY ECY 394 

II I | : | : : : | | : | I 

Db 301 NTYGEY S VN C S H I LAMP D WF — VI G- GI Q YP VP ALAYT EQNGQGT CM 345 



Qy 395 RFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFA 429 

: I : : : I : : I I I I I I I I I I 

Db 346 SSFQNSSADLWILGDVFIRVYYSIFDRANNRVGLA 380 



RESULT 14 
KHHUD 

cathepsin D (EC 3.4.23.5) precursor [validated] - human 
N;Alternate names: preprocathepsin D 
C; Species: Homo sapiens (man) 

C;Date: 28-Dec-1987 #sequence_revision 28-Dec-1987 #text_change 15-Sep-2000 

C;Accession: A25771; S30749; PC2066; I59236; I57716

R; Faust, P.L.; Kornfeld, S.; Chirgwin, J.M. 

Proc. Natl. Acad. Sci. U.S.A. 82, 4910-4914, 1985 

A; Title: Cloning and sequence analysis of cDNA for human cathepsin D. 
A;Reference number: A25771; MUID:85270436; PMID:3927292
A;Accession: A25771
A;Molecule type: mRNA
A;Residues: 1-412 <FAU>
A;Cross-references: EMBL:M11233; NID:g181179; PIDN:AAB59529.1; PID:g181180

R;Westley, B.R.; May, F.E.B. 

Nucleic Acids Res. 15, 3773-3786, 1987 

A; Title: Oestrogen regulates cathepsin D mRNA levels in oestrogen responsive 
human breast cancer cells. 

A;Reference number: S30749; MUID:87231068; PMID:3588310
A;Accession: S30749
A;Molecule type: mRNA
A;Residues: 1-412 <WES>
A;Cross-references: EMBL:X05344; NID:g29677; PIDN:CAA28955.1; PID:g29678
R;May, F.E.B. ; Smith, D.J.; Westley, B.R. 
Gene 134, 277-282, 1993 

A; Title: The human cathepsin D-encoding gene is transcribed from an estrogen- 
regulated and a constitutive start point. 
A;Reference number: PC2066; MUID:94085791; PMID:8262386
A;Accession: PC2066
A;Molecule type: DNA
A;Residues: 1-23 <MAY>
A;Cross-references: GB:L12980; NID:g291930; PIDN:AAA16314.1; PID:g455429
A; Experimental source: MCF-7 cell 
R;Cavailles, V.; Augereau, P.; Rochefort, H. 
Proc. Natl. Acad. Sci. U.S.A. 90, 203-207, 1993 

A;Title: Cathepsin D gene is controlled by a mixed promoter, and estrogens
stimulate only TATA-dependent transcription in breast cancer cells.
A;Reference number: I59236; MUID:93126342; PMID:8419924
A;Accession: I59236
A;Status: translation not shown; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-22 <CAV1>
A;Cross-references: GB:S52557; NID:g263124; PIDN:AAD13868.1; PID:g4261568
R;Augereau, P.; Miralles, F. ; Cavailles, V.; Gaudelet, C; Parker, M. ; 
Rochefort, H. 

Mol. Endocrinol. 8, 693-703, 1994 

A; Title: Characterization of the proximal estrogen-responsive element of human 
cathepsin D gene. 

A;Reference number: I57716; MUID:95021301; PMID:7935485
A;Accession: I57716
A;Status: translation not shown; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-22 <CAV2>
A;Cross-references: GB:S74689; NID:g786350; PIDN:AAD14156.1; PID:g4261856



R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Erickson, J.W. 
submitted to the Brookhaven Protein Data Bank, April 1993 
A;Reference number: A51839; PDB:1LYA
A;Contents: annotation; X-ray crystallography, 2.5 angstroms, residues 65-161;170-241
R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Erickson, J.W.
submitted to the Brookhaven Protein Data Bank, April 1993
A;Reference number: A51840; PDB:1LYB
A;Contents: annotation; X-ray crystallography, 2.5 angstroms, with inhibitor
residues 65-161;170-241

R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Hosur, M.V. ; Sowder II, R.C.; Cachau, 
R.E.; Collins, J.; Silva, A.M.; Erickson, J.W. 
Proc. Natl. Acad. Sci. U.S.A. 90, 6796-6800, 1993 

A;Title: Crystal structures of native and inhibited forms of human cathepsin D:
implications for lysosomal targeting and drug design.
A;Reference number: A48229; MUID:93342076; PMID:8393577
A;Contents: annotation; X-ray crystallography, 2.5 angstroms
C;Comment: Cathepsin D is a ubiquitous lysosomal proteinase.
C;Comment: In addition to the propeptide, residues 163-168 and 411-412 are
proteolytically removed. Residues 169 and 170 are also partially removed.
C;Comment: The carbohydrate bound to 134-Asn contains a mannose-6-phosphate that
is bound near 267-Lys and the phosphotransferase recognition region.
C;Genetics:
A;Gene: GDB:CTSD
A;Cross-references: GDB:120512; OMIM:116840
A;Map position: 11p15.5-11p15.5
C;Function:
A;Description: limited specificity endopeptidase
A;Pathway: intracellular protein degradation
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; glycoprotein; hydrolase; lysosome; protein
degradation
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-64/Domain: propeptide #status predicted <PRO>
F;65-162,169-410/Product: cathepsin D #status experimental <MAT>
F;267,329-356/Region: phosphotransferase recognition
F;91-160,110-117,286-290,329-366/Disulfide bonds: #status experimental
F;97,295/Active site: Asp #status experimental
F;134,263/Binding site: carbohydrate (Asn) (covalent) #status experimental

Query Match 11.5%; Score 308.5; DB 1; Length 412; 

Best Local Similarity 27.1%; Pred. No. 2.5e-15; 

Matches 121; Conservative 75; Mismatches 180; Indels 71; Gaps 22; 

Qy 9 L L P L LAQW L L RAAP E LAP AP FT L P L RVAAAT N R WAP T P G PGTPAERHADGLAL 62 

I I I I I I I I I I : I I : I : : I | : : : : 

Db 6 LLPLAL — CLLAAP — ASALVRI PLHKFTSIRRTMSEVGGSVEDLIAKGPVSKYSQAVPA 61 

Qy 63 ALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTP 122 

I : I I :: I I |: I I I I I I :: I I I I I I I 

Db 62 VTEGPI — PEVLKNYM DAQYYGEIGIGTPPQCFTWFDTGSSNLWVPSIH 109 

Qy 123 HSYIDT YFDTERS STYRS KGFDVTVKYTQGSWTGFVGEDLVTI P — KGFNTSFL 174 

: I : : : : : | | | | I : I I I : I : : : I I : : I : I I 

Db 110 CKLLDIACWIHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASAL 169 

Qy 175 — VNIATIFESENFFLPGI KWNGI LGLAYATLAKP S S S LET FFDS LVTQANI - PN 226 



I : | III I : : I I I I : I I : : : : : I I : I : I : I 

Db 170 GGVKVERQVFGEATKQPGITFIAAKFDGILGMAYPRIS — VNNVL PVFDNLMQQKLVDQN 227 

Qy 227 VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQ 286 

: I I I 111:111: I I I : I : : I : I : : : : I : 

Db 228 IFSFY LSRDPDAQPGGELMLGGTDSKYYKGSLSYLNVTRKAYWQVHLDQVEV-AS 281 

Qy 287 SLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWT 346 

I I I : I : | | I I : I I : I : I III : I : : I 

Db 282 GLTL-CKE — GCEAIVDTGTSLMVGP VDEVRELQKAIGAVPLIQGEY MIPC — 329 

Qy 347 NSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRF GISPSTN 403 

I I I : : I : : : : : I : I : I I I I I : 

Db 330 EKVSTLPAITLKL GGKGYKLS — PEDYTLKVSQAGKTLCLSGFMGMDIPPPSG 380 

Qy 404 AL-VIGATVMEGFYVIFDRAQKRVGFA 429 

I : : I : : I : I I I I I I I I 

Db 381 PLWILGDVFIGRYYTVFDRDNNRVGFA 407 



RESULT 15 
KHMSD 

cathepsin D (EC 3.4.23.5) precursor - mouse 
C; Species: Mus musculus (house mouse) 

C;Date: 31-Dec-1991 #sequence_revision 31-Dec-1991 #text_change 18-Jun-1999 
C;Accession: I48278; S14704; S12587

R;Hetman, M. ; Perschl, A.; Saftig, P.; Von Figura, K. ; Peters, C. 
DNA Cell Biol. 13, 419-427, 1994 

A;Title: Mouse cathepsin D gene: molecular organization, characterization of the 
promoter, and chromosomal localization. 

A;Reference number: I48278; MUID:94280622; PMID:8011168
A;Accession: I48278
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-410 <RES>
A;Cross-references: EMBL:X68378; NID:g50302; PIDN:CAA48453.1; PID:g817945
R;Diedrich, J.F.; Staskus, K.A. ; Retzel, E.F.; Haase, A.T. 
Nucleic Acids Res. 18, 7184, 1990 

A; Title: Nucleotide sequence of a cDNA encoding mouse cathepsin D. 
A; Reference number: S14704; MUID: 91088345; PMID: 2263503 
A;Accession: S14704 
A; Molecule type: mRNA 
A;Residues: 1-410 <DIE> 

A;Cross-references: EMBL:X53337; NID:g50300; PIDN:CAA37423.1; PID:g50301
R;Grusby, M.J.; Mitchell, S.C.; Glimcher, L.H. 
Nucleic Acids Res. 18, 4008, 1990 

A;Title: Molecular cloning of mouse cathepsin D. 
A;Reference number: S12587; MUID: 90326544; PMID:2374732 
A;Accession: S12587 
A; Molecule type: mRNA 
A; Residues: 1-410 <GRU> 

A;Cross-references: EMBL:X52886; NID:g50298; PIDN:CAA37067.1; PID:g50299
C;Genetics: 

A;Introns: 23/2; 76/3; 118/1; 157/3; 233/2; 274/2; 322/3; 355/3 
C; Function: 

A; Description: limited specificity endopeptidase 
A; Pathway: intracellular protein degradation 



C;Superfamily: pepsin
C;Keywords: aspartic proteinase; glycoprotein; hydrolase; lysosome; protein
degradation
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-64/Domain: propeptide #status predicted <PRO>
F;65-410/Product: cathepsin D, single-chain form #status predicted <MAT>
F;91-160,110-117,284-288,327-364/Disulfide bonds: #status predicted
F;97,293/Active site: Asp #status predicted
F;134,261/Binding site: carbohydrate (Asn) (covalent) #status predicted



Query Match 11.4%; Score 306.5; DB 1; Length 410; 
Best Local Similarity 27.5%; Pred. No. 3.6e-15; 
Matches 103; Conservative 64; Mismatches 123; Indels 85; Gaps 15; 

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDT YFDTERSSTYRSKGFDV 145 

I I :: I I I I I I :: I I I I I I I : I : : : : : | | | | I 

Db 79 YYGDIGIGTPPQCFTWFDTGSSNLWVPSIHCKILDIACWVHHKYNSDKSSTYVKNGTSF 138 

Qy 146 TVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT IFESENFFLPGIKWNGIL 197 

: I | | : I : : : I I : : : I I I I : : I I I 

Db 139 DIHYGSGSLSGYLSQDTVSVPCKSDQSKARGIKVEKQIF-GEATKQPGIVFVAAKFDGIL 197 

Qy 198 GLAYAT LAK PSSSLETFFDS LVT QAN I - PNVF SMQMCGAGL P VAG S GTN GG S LVLGG I E P 256 

I I I I I : I I I : I : I : : : I : I : I : I I 

Db 198 GMGYPHIS — VNNVLPVFDNLMQQKLVDKNIFSFY LNRDPEGQPGGELMLGGTDS 250 

Qy 257 SLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVF 316 

I I : : I : : I : I : : : I I : I : I I : : I I I I : I I : I I I : 

Db 251 KYYHGELSYLNVTRKAYWQVHMDQLEVGNE-LTL-CK — GGCEAIVDTGTSLLVGPVEEV 306 

Qy 317 DAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 376 

• • I • • • i • 1 1 : I • • • I ... i 

Db 307 KELQKAIGAVPLI- -QGEYMIPCEKVSSL 333 

Qy 377 PQLYIQPMMGAGLNYEC YRFGIS-PSTNALVIGATVMEG 414 

: I I III : I I I : : I 

Db 334 PTVYLK — LG-GKNYELHPDKYILKVSQGGKTICLSGFMGMDIPPPSGPLWILGDVFIGS 390 

Qy 415 FYVIFDRAQKRVGFA 429 

: I : I I I I I I I I 
Db 391 YYTVFDRDNNRVGFA 405 



Search completed: March 4, 2004, 15:40:59 
Job time : 29.1043 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: March 4, 2004, 15:30:05 ; 



Search time 28.1043 Seconds 

(without alignments) 

1772.942 Million cell updates/sec 
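The throughput figure above can be sanity-checked from the other values in this header. The Python sketch below assumes that one "cell update" is one (query residue, database residue) pair in the dynamic-programming matrix; that definition is an assumption, as the report does not spell it out. The query length (518) and database size (96191526 residues) are taken from the "Sequence" and "Searched" lines of this same header, and the function name is hypothetical.

```python
def cell_updates_per_second(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second of search time.

    Assumption: every query residue is compared against every
    database residue exactly once.
    """
    return query_len * db_residues / seconds


rate = cell_updates_per_second(518, 96_191_526, 28.1043)
print(f"{rate / 1e6:.3f} million cell updates/sec")
```

The result agrees with the reported 1772.942 million to within rounding of the printed search time.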



Title: US-09-668-314C-2 
Perfect score: 2687 

Sequence: 1 MGALARALLLPLLAQWLLRA..........RPRDPEWNDESSLVRHRWK 518 



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
                 Maximum Match 100% 
                 Listing first 45 summaries 

Database : PIR_78:* 
           1: pir1:* 
           2: pir2:* 
           3: pir3:* 
           4: pir4:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
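The header names a Smith-Waterman ("sw") model with a gap-open penalty of 10.0 and a gap-extension penalty of 0.5. As a rough illustration of that kind of scoring, here is a minimal Python sketch of local alignment with affine gaps (the Gotoh recurrences). It is not GenCore's implementation. Two assumptions are made: a gap of length k costs gap_open + (k - 1) * gap_ext, which the report does not confirm, and a hypothetical `toy_score` function stands in for the BLOSUM62 table, which is not reproduced here.

```python
NEG_INF = float("-inf")


def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix such as BLOSUM62."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a, b, score=toy_score, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of sequences a and b with affine gaps."""
    m = len(b)
    h_prev = [0.0] * (m + 1)      # best score ending at (i-1, j)
    f_prev = [NEG_INF] * (m + 1)  # ... ending in a gap that skips residues of a
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [NEG_INF] * (m + 1)
        e = NEG_INF               # ... ending in a gap that skips residues of b
        for j in range(1, m + 1):
            # Open a new gap from the best cell, or extend the running gap.
            e = max(h_cur[j - 1] - gap_open, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_ext)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: a score never drops below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Twelve matches (12 * 5) minus one gap of length two (10.0 + 0.5).
print(smith_waterman_affine("AAAAAATTTTTT", "AAAAAACCTTTTTT"))
```

Only two rows of the matrix are kept, so memory grows with the shorter dimension; recovering the alignment itself would need a traceback, which this sketch omits.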



SUMMARIES 



Result              % Query 
   No.    Score     Match    Length  DB  ID       Description 
     1    1178.5    43.9     501     2   A59090   aspartic proteinas 
     2    367.5     13.7     383     2   JC7573   pepsinogen C - Afr 
     3    363.5     13.5     377     1   PEMQCJ   gastricsin (EC 3.4 
     4    355.5     13.2     384     2   A39314   gastricsin (EC 3.4 
     5    355       13.2     389     2   JE0371   pepsin C (EC 3.4.2 
     6    353       13.1     388     2   A29937   gastricsin (EC 3.4 
     7    351.5     13.1     388     2   JC7246   pepsinogen C - com 
     8    324.5     12.1     394     2   B43356   gastricsin (EC 3.4 
     9    320       11.9 
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2 
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10 


320 
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1 
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11 
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2 
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12 
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1 


A24608 
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13 
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11.5 
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2 


A41443 
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22 
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1 
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2 
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23 
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1 
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2 
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24 
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1 
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2 
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25 
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1 
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1 
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27 
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0 
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2 
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28 
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0 
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1 
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29 
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9 
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2 
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30 
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8 
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1 
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31 
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8 
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1 
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32 
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8 
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1 


REHUK 
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33 
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8 
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2 
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34 
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10. 


8 
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E38302 
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35 
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7 
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2 


B38302 
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36 
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10. 


7 
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37 
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7 
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41 
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6 


382 


1 


PECH 


pepsin A (EC 3.4.2 


42 
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6 
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1 
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43 
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44 


285 


10. 


6 
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2 


T47207 


aspartic proteinas 


45 
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6 
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1 


PEPG 


pepsin A (EC 3.4.2 



ALIGNMENTS 



RESULT 1 
A59090 

aspartic proteinase (EC 3.4.23.-) BACE precursor - human 
N;Alternate names: beta-secretase; beta-site APP cleaving enzyme 
C; Species: Homo sapiens (man) 

C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 11-May-2000
C;Accession: A59090 

R;Vassar, R. ; Bennett, B.D.; Babu-Khan, S.; Kahn, S.; Mendiaz, E.A. ; Denis, P.; 
Teplow, D.B.; Ross, S.; Amarante, P.; Loeloff, R. ; Luo, Y. ; Fisher, S.; Fuller, 
J.; Edenson, S.; Lile, J.; Jarosinski, M.A. ; Biere, A.L.; Curran, E.; Burgess, 
T.; Louis, J.C.; Collins, F. ; Treanor, J.; Rogers, G. ; Citron, M. 
Science 286, 735-741, 1999 

A; Title: beta-Secretase cleavage of Alzheimer's amyloid precursor protein by the 
transmembrane aspartic protease BACE. 

A;Reference number: A59090; MUID:20002972; PMID:10531052
A;Note: submitted to GenBank, September 1999
A;Accession: A59090
A;Status: not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-501 <VAS>
A;Cross-references: GB:AF190725; NID:g6118538; PIDN:AAF04142.1; PID:g6118539
C;Genetics:
A;Gene: BACE
C;Superfamily: beta-secretase
C;Keywords: Alzheimer's disease; aspartic proteinase; brain; glycoprotein;
hydrolase; protein digestion; transmembrane protein; zymogen
F;1-21/Domain: signal sequence #status predicted <SIG>
F;22-45/Domain: propeptide #status predicted <PRO>
F;46-501/Product: acid proteinase BACE #status predicted <MAT>
F;461-477/Domain: transmembrane #status predicted <TRN>
F;93,289/Active site: Asp #status predicted
F;153,172,223,354/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;330-380/Disulfide bonds: #status predicted

Query Match 43.9%; Score 1178.5; DB 2; Length 501; 

Best Local Similarity 46.2%; Pred. No. 3.8e-80; 

Matches 240; Conservative 82; Mismatches 164; Indels 33; Gaps 9; 
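The "Query Match" percentage in these blocks is consistent with the hit's score divided by the "Perfect score" printed in the run header (2687, the score of the query aligned to itself). The report does not state the formula, so the short Python check below rests on that inferred reading; the names are hypothetical.

```python
PERFECT_SCORE = 2687.0  # "Perfect score" from the run header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


# Scores of RESULT 1 (A59090) and RESULT 2 (JC7573) in this run.
for name, score in [("A59090", 1178.5), ("JC7573", 367.5)]:
    print(f"{name}: {query_match_percent(score):.1f}%")
```

Both values round to the figures printed in the report (43.9% and 13.7%).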

Qy 7 ALLLPLLAQWLLRAAPELAPAPFT L P LRVAAATNRWAP T P G P GT P AE RHAD GLA 61 

I I I I I : : I I I I I I I II M 
Db 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL — GLR 42 

Qy 62 LALE — PALAS PAGAANFLAMVDNLQGDSGRGYYLEMLI GTPPQKLQI L VX)TGS SNFAVA 119 

I I I : I : I I I I I : I I I : I II : I I : I : I I I I I I I I I I I I I I I I 

Db 43 LPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102 

Qy 120 GTPHSYIDTYFDTERSSTYRSKGFDWVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179 

||:: I : : I I I I I I I I I I I I I : I I I I : I I I I : I I I 

Db 103 AAPHPFLHRYYQRQLSSTYRDLRKGVWPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162 

Qy 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

I ||: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I : I : I I I I I : 
Db 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPL 222 

Qy 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296 

I : | | | : : : | | | : III I : I I I I I : MM:: | : : : | | li | : I I : I I I 

Db 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

Qy 297 ABKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356 

I I : I I I I I I I I I I I : I I I : I I : : : II : I I M I I I I II I I I : I I 

Db 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

Qy 357 KI SI YLRDENS SRS FRITI LPQLYIQPMMGAGLNY-ECYRFGI S PSTNALVI GATVMEGF 415 

I I : I I I : : : I I I I I I I II I : : I : : : I I : I I I I : I : I I : I I I I 

Db 343 VI SLYLMGEWNQS FRITI LPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402 

Qy 416 WIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYAL 475 

I I : I I I I : I I : I I I I I : : I I I I I : I I : : I : 

Db 403 YWFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462 

Qy 476 MSVCGAILLVLIVLLLLPFRCQR — RPRDPEWNDESSL 512 

: : I I : : : : I : : : I I I I : : : I I I 
Db 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 



RESULT 2 
JC7573 



pepsinogen C - African clawed frog 

N;Alternate names: progastricsin

C; Species: Xenopus laevis (African clawed frog) 

C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001 
C;Accession: JC7573; PC7118 

R;Ikuzawa, M. ; Inokuchi, T . ; Kobayashi, K.; Yasumasu, S. 
J. Biochem. 129, 147-153, 2001 

A;Title: Amphibian pepsinogens: Purification and characterization of Xenopus 

pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens. 

A;Reference number: JC7573; MUID:21064922; PMID:11134969

A; Contents: Stomach 

A;Accession: JC7573 

A;Molecule type: mRNA 

A; Residues: 1-383 <IKU> 

A;Cross-references: DDBJ:AB045379

A; Accession: PC7118 

A;Molecule type: protein 

A; Residues: 17-68 <IK2> 

C;Comment: This protein is a zymogen for gastric aspartic proteinase, with 

pepsin-like activity. 

C; Genetics: 

A; Gene: PgC 

C;Superfamily: pepsin

C; Keywords: stomach; zymogen 

Query Match 13.7%; Score 367.5; DB 2; Length 383; 

Best Local Similarity 28.9%; Pred. No. 9.1e-20; 

Matches 132; Conservative 70; Mismatches 154; Indels 101; Gaps 25; 

Qy 1 MGALARALL L P L LAQWL L RAAP E LAP AP FT L P L RVAAATN RWAP T P G P GT P AE RHAD GL 60

I ||: I : : : : I I : I I : I I II : : : 
Db 1 MKFLILALVCLQLSEGIIR VPLKKFKSMREVMRENGIKAPLVDPAT KYYNQY 52

Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

I I M : I : : I I I i : I I I I I I : I II I I I I II 

Db 53 ATAYEP LSNYMDM SYYGEISIGTPPQNFLVLFDTGSSNLWVAS 95

Qy 121 TPHSYIDT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSF 173

I : I : : I I I I I : : : I I I I I : I I I I I 

Db 96 T YCQSQACTNHPLFNPSQSSTYSSNQQQFSLQYGTGSLTGILGYDTVTIQ 145

Qy 174 LVNIATIFESENFFL PG IKWNGILGLAYATLAKPSSSLETFFDSLVTQANI 224

I : I : I I II : : : I I I I I I I : : I : : I : : I I : 

Db 146 — NVA — ISQQEFGLSETEPGTNFVYAQFDGILGLAYPSIAVGGAT — TVMQGMM-QQNL 198

Qy 225 PN — VFSMQMCGAGLPVAGS GTNGGS LVLGGI EP SL YKGDI WYT P I KEEWYYQI EI LKLE 282

I : I : I I III: I I : : : I I I : : I I : I I : I I I 

Db 199 LNQPI FGFYLSGQ S SQNGGEVAFGGVDQN YYTGQI YWT PVT S ET YWQI GI QGFS 252

Qy 283 IGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQL 342

III: I : : | | | | : | | : | | II II :::::: : I : 

Db 253 INGQATGW-CSQ — GCQAIVDTGTSLLTAPQSVFSSLIQSIG AQQDQNGQYV 301

Qy 343 ACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYI-QPMMGAGLNYECYRFGIS — 399

: I : III: III: I I : I I I I I 

Db 302 VSCSNIQN LPTISFTI SGVSFPLP — PSAYVLQQSSG YC-TIGIMPT 345




Qy 400 — PSTNA LVI GATVMEGFYVI FDRAQKRVGFAAS 431

III : : I : : I : : I : I I I I : 

Db 346 YLPSQNGQPLWILGDVFLREYYSVYDLGNNQVGFATA 382



RESULT 3 
PEMQCJ 

gastricsin (EC 3.4.23.3) precursor - Japanese macaque (fragment) 
N;Alternate names : pepsin C 

C; Species: Macaca fuscata (Japanese macaque) 

C;Date: 13-Aug-1986 #sequence_revision 19-Oct-1995 #text_change 18-Jun-1999 
C;Accession: S19683; A00986; A22402; S16066 
R;Kageyama, T . ; Tanabe, K. ; Koiwai, O. 
Eur. J. Biochem. 202, 205-215, 1991 

A; Title: Development-dependent expression of isozymogens of monkey pepsinogens 
and structural differences between them. 

A;Reference number: S19681; MUID:92037645; PMID:1935977
A;Accession: S19683
A;Molecule type: mRNA
A;Residues: 1-377 <KAG>

A;Cross-references: EMBL:X59754; NID:g38072; PIDN:CAA42426.1; PID:g38073

R;Kageyama, T.; Takahashi, K. 

J. Biol. Chem. 261, 4406-4419, 1986 

A;Title: The complete amino acid sequence of monkey progastricsin.

A;Reference number: A00986; MUID:86168133; PMID:3514597

A;Accession: A00986

A;Molecule type: protein

A;Residues: 6-330,'V',332-349,'VY',350-377 <KA2>
R;Kageyama, T . ; Takahashi, K. 
J. Biochem. 97, 1235-1246, 1985 

A; Title: Monkey pepsinogens and pepsins. VII. Analysis of the activation process 

and determination of the NH2-terminal 60-residue sequence of Japanese monkey 

progastricsin, and molecular evolution of pepsinogens. 

A;Reference number: A22402; MUID: 85289106; PMID:3928607 

A;Accession: A22402

A;Molecule type : protein 

A; Residues: 6-65 <KA3> 

C; Comment: This enzyme has more restricted specificity than pepsin A. 

C; Comment: The enzyme is activated in a two-step process that gives rise to two 

end products. The shorter, Ser-gastricsin, is the major product. 

C;Superfamily: pepsin

C; Keywords: aspartic proteinase; gastric juice; hydrolase; protein digestion; 
stomach 

F;1-5/Domain: signal sequence (fragment) #status predicted <SIG>
F;6-377/Product: progastricsin #status experimental <ZYM>
F;6-45/Domain: activation peptide #status experimental <APT>
F;46-377/Product: Gly-gastricsin #status experimental <MIN>
F;49-377/Product: Ser-gastricsin #status experimental <MAT>
F;31-32/Cleavage site: Phe-Leu (pepsin) #status experimental
F;45-46/Cleavage site: Phe-Gly (pepsin) #status experimental
F;48-49/Cleavage site: Leu-Ser (pepsin) #status experimental
F;80,265/Active site: Asp #status predicted

F;93-98,256-260,299-332/Disulfide bonds: #status experimental



Query Match 13.5%; Score 363.5; DB 1; Length 377;

Best Local Similarity 28.9%; Pred. No. 1.8e-19;
Matches 118; Conservative 65; Mismatches 118; Indels 107; Gaps 19;



Qy 56 HAD GLALALEP ALAS PAGAANFLAMVDNLQGDSGRGYYLEMLI GTP PQKLQI LVDTGS SN 115 

I I : : : I I : I : I I : I : I I I I I I MINIM 

Db 44 HFGDLSVSYEP MAYMD AAYFGEISIGTPPQNFLVLFDTGSSN 85 

Qy 116 FAV AGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPK 167 

I I I I I I : I II I : I : : : I I I I I I I I : I : 

Db 86 LWVPSVYCQSQACTSHS RFNPSESSTYSTNGQTFSLQYGSGSLTGFFGYDTLTV — 139 

Qy 168 GFNTSFLVNIATIFESENFFLPG IKWNGILGLAYATLAKPSSSLETFFDSLVTQA 222 

I I III II : : : I I : I I I I I I : : : I : I : 

Db 140 QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMGLAYPTLSVDGAT — TAMQGMVQEG 192 

Qy 223 NIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKL 281 

: : : I I : : I : : I I : : I I I : : I II I I : : I : : I I : I I I : 

Db 193 ALTSPIFSVYLSDQ QGSSGGAWFGGVDSSLYTGQIYWAPVTQELYWQIGIEEF 246 

Qy 282 EIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQ 341 

MM: II M II I M I M I M I : I : : M MM 

Db 247 LIGGQASGW-CSE — GCQAIVDTGTSLLTVPQQYMSALLQA TGAQ 288

Qy 342 LACWTNSETPWSYF PKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY 391 

I : I I : : : : I I I II 

Db 289 EDEYGQFLVNCNSIQNLPTLTFII NGVEFPLPPSSYI LNN 328 

Qy 392 ECY-RFGISP STNALVI GATVMEGFYVI FDRAQKRVGFAAS 431 

I I : I I : M : M : M : Mill: 

Db 329 NGYCTVGVEPTYLSAQNSQPLWILGDVFLRSYYSVYDLSNNRVGFATA 376



RESULT 4 
A39314 

gastricsin (EC 3.4.23.3) precursor - bullfrog 
C; Species: Rana catesbeiana (bullfrog) 

C;Date: 19-Jun-1992 #sequence_revision 19-Jun-1992 #text_change 22-Jun-1999 
C;Accession: A39314 

R;Yakabe, E. ; Tanji, M. ; Ichinose, M. ; Goto, S.; Miki, K. ; Kurokawa, K. ; Ito, 

H.; Kageyama, T.; Takahashi, K. 

J. Biol. Chem. 266, 22436-22443, 1991 

A; Title: Purification, characterization, and amino acid sequences of pepsinogens 

and pepsins from the esophageal mucosa of bullfrog (Rana catesbeiana) . 

A;Reference number: A39314; MUID:92042186; PMID:1939266

A;Accession: A39314

A; Status: preliminary 

A;Molecule type: mRNA 

A; Residues: 1-384 <YAK> 

A;Cross-references: GB:M73750; NID:g213687; PIDN:AAA49530.1; PID:g213688
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; hydrolase; protein digestion 

Query Match 13.2%; Score 355.5; DB 2; Length 384; 

Best Local Similarity 26.5%; Pred. No. 7.2e-19; 

Matches 120; Conservative 73; Mismatches 136; Indels 123; Gaps 21; 

Qy 23 E LAP AP FT L P L RVAAATN RW APT PGPGTPAERHAD GLALALEP ALAS PAGAAN 76 

:| : MM : M II II : : : I I II II 

Db 12 QLSEGIIKVPLKKFKSMREVMRDHGIKAPWDPAT KYYNNFATAFEP LAN 61 



Qy 77 FLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDT Y 129 

:: I llhllllll MINIM I I M : 
Db 62 YMDM SYYGEISIGTPPQNFLVLFDTGSSNLWV PSTYCQSQACTNHPQ 108 

Qy 130 FDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFL- 188 

I : M I M I : : M II II M I I I III : I I 

Db 109 FNPSQSSSYSSNQQQFSLQYGTGSLTGILGYDTVQIQ NIA— ISQQEFGLS 157 

Qy 189 PG IKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPN — VFSMQMCGAGLP 238 

II : : M I II II I : M : : : I : : I I : I M : : I 
Db 158 VT E P GTN FVYAQ FD G I L GLAY P S I AEGGAT — TVMQGMI-QQNLINQPLFAFYLSG 210 

Qy 239 VAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNAD 298 

: Ml: I I : : : I I I : M I : I I M I I : M : I : 

Db 211 -QQNSQNGGEVAFGGVDQNYYSGQIYWT PVTSETYWQIGIQGFSVNGQATGW-CSQ — GC 266 

Qy 299 KAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQ LACWTNSET 350 

: II I : I I M I MM:::::: | : | : : I 

Db 2 67 QGIVDTGTSLLTAPQSVFSSLMQSI GAQQDQN GQ YAVS C S N I Q S L 311 

Qy 351 PWSYFP KISIYLRDENS SRSFRITILPQLYIQPMMGAGLNYECYRFGIS 399 

I I I I : : I I : III I I : 

Db 312 PTISFTISGVSFPLPPSAYVLQQNSGYCTIGIMPTYLPSQNGQPLW 357 

Qy 400 P STNALVI GATVMEGFYVI FDRAQKRVGFAAS 431 

: : I : : I : : I : I I I I I : 

Db 358 ILGDVFLRQYYSVYDLGNNQVGFAAA 383 



RESULT 5 
JE0371 

pepsin C (EC 3.4.23.-) precursor - chicken 
N; Alternate names : pepsinogen C 
C; Species: Gallus gallus (chicken) 

C;Date: 23-Jul-1999 #sequence_revision 23-Jul-1999 #text_change 11-May-2000
C;Accession: JE0371

R;Sakamoto, N. ; Saiga, H . ; Yasugi, S. 

Biochem. Biophys. Res. Commun. 250, 420-424, 1998

A; Title: Analysis of temporal expression pattern and cis-regulatory sequences of 
chicken pepsinogen A and C. 

A; Reference number: JE0370; MUID: 98440813; PMID: 9753645 

A; Accession : JE0371 

A; Status : preliminary 

A;Molecule type : mRNA 

A; Residues: 1-389 <SAK> 

C;Superfamily: pepsin

C; Keywords: aspartic proteinase; hydrolase 

Query Match 13.2%; Score 355; DB 2; Length 389; 

Best Local Similarity 28.7%; Pred. No. 7.9e-19; 

Matches 114; Conservative 58; Mismatches 121; Indels 104; Gaps 16; 

Qy 75 AN FLAMVDNLQGDS GRGYYLEMLI GT P PQKLQI LVDTGS SN FAVAGT PHSYI 126 

:|| : I : I I I : I II II I MINIM I I I : 

Db 56 SNFATAYEPLANNMDMSYYGEISIGTPPQNFLVLFDTGSSNLWVPSTLCQSQACANHN — 113 



Qy 127 DTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFN TS 172 

II III::: : : : I MM I I I I I : I : II 

Db 114 --EFDPNESSTFSTQDEFFSLQYGSGSLTGIFGFDTVTI-QGISITNQEFGLSETEPGTS 170 

Qy 173 FLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPN — VFSM 230 

II: : : II I M I : : : I : I : I I : : I I I 

Db 171 FLYS PFDGILGLAFPSI SAGGATTVMQKMLQENLLDFPVFSF 212 

Qy 231 QMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNL 290 

: I I : II I I II : : I : M I I : M : : Mill Mill 

Db 213 YLSGQ EGSQGGELVFGGVDPNLYTGQITWTPVTQTTYWQIGIEDFAVGGQSSGW 266 

Qy 291 DCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSET 350 

I : : | | | : | | : | | : | : | | : : : : : I : I M Ml 
Db 2 67 -CSQ — GCQGIVDTGTSLLTVPNQVFTELMQYIG AQADD SGQYVASCSNIE- 314 

Qy 351 PWSYFPKI SIYLRDENS SRSFRITILPQLYIQPMMGAGLNYECY 394 

III I I : I I : III II: 

Db 315 YMPTITFVISGTSFPLPPSAYMLQSNSDYCTVGIESTYLPSQTGQPLW 362 

Qy 395 R FG I S P S TNALVI GAT VME G F YVI FD RAQ K RVG FAAS 431 

: : I : M I M Mill: 

Db 363 ILGDVFLRVYYSIYDMGNNQVGFATA 388 



RESULT 6 
A29937 

gastricsin (EC 3.4.23.3) precursor - human 
N;Alternate names: pepsin C; pepsinogen C 
C; Species : Homo sapiens (man) 

C;Date: 17-Oct-1988 #sequence_revision 17-Oct-1988 #text_change 31-Mar-2000 
C;Accession: A29937; A31811; PX0028; I54213; A91125; A23458
R;Hayano, T.; Sogawa, K.; Ichihara, Y.; Fujii-Kuriyama, Y.; Takahashi, K.
J. Biol. Chem. 263, 1382-1385, 1988 

A; Title: Primary structure of human pepsinogen C gene. 
A;Reference number: A29937; MUID: 88087276; PMID:3335549 
A; Accession: A29937 
A; Molecule type: DNA 
A; Residues: 1-388 <HAY> 

R;Taggart, R.T.; Cass, L.G.; Mohandas, T.K.; Derby, P.; Barr, P.J.; Pals, G. ; 
Bell, G.I. 

J. Biol. Chem. 264, 375-379, 1989 

A;Title: Human pepsinogen C (progastricsin). Isolation of cDNA clones,
localization to chromosome 6, and sequence homology with pepsinogen A. 
A;Reference number: A31811; MUID:89079679; PMID:2909526
A;Accession: A31811
A;Molecule type: mRNA
A; Residues: 1-388 <TAG> 

A;Cross-references: GB:J04443; NID:g551175; PIDN:AAA60074.1; PID:g551176
R;Athauda, S.B.P.; Tanji, M. ; Kageyama, T.; Takahashi, K. 
J. Biochem. 106, 920-927, 1989 

A; Title: A comparative study on the NH2-terminal amino acid sequences and some 

other properties of six isozymic forms of human pepsinogens and pepsins. 

A; Reference number: PX0023; MUID: 90130402 ; PMID:2515193 

A; Accession: PX0028 

A; Molecule type : protein 

A; Residues: 17-101 <ATH> 



R;Pals, G.; Azuma, T.; Mohandas, T.K.; Bell, G.I.; Bacon, J.; Samloff, I.M.; 
Walz, D.A.; Barr, P.J.; Taggart, R.T. 
Genomics 4, 137-148, 1989 

A;Title: Human pepsinogen C (progastricsin) polymorphism: evidence for a single
locus located at 6p21.1-pter. 

A;Reference number: I54213; MUID:89290840; PMID:2567697
A;Accession: I54213

A; Status: translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A; Residues: 1-388 <RES> 

A;Cross-references: GB:M23077; NID:g189830; PIDN:AAA60063.1; PID:g387015;
GB: J03063 

A; Note: parts of this sequence, including the amino end and carboxyl ends of the 

mature protein, were determined by protein sequencing 

R;Foltmann, B.; Jensen, A.L. 

Eur. J. Biochem. 128, 63-70, 1982 

A; Title: Human progastricsin. Analysis of intermediates during activation into 
gastricsin and determination of the amino acid sequence of the propart. 
A; Reference number: A91125; MUID : 83079318 ; PMID: 6816595 
A;Accession: A91125 
A;Molecule type: protein 

A;Residues: 17-39,'ED',42-51,'S',53-64 <FOL>
A;Note: pro-form; 29-Leu was also found 

A;Note: activation at pH 2 is proposed to involve conformation change, cleavage 

after Phe-42, and cleavage after Leu-59 

C;Genetics:

A;Gene: GDB:PGC

A;Cross-references: GDB:119485; OMIM:169740
A;Map position: 6p21.3-6p21.1

A;Introns: 20/2; 70/3; 110/1; 149/3; 216/2; 256/2; 305/3; 338/3
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; hydrolase; protein digestion; stomach; zymogen 
F; 1-16/Domain: signal sequence #status predicted <SIG> 
F;17-59/Domain: propeptide #status experimental <PRO>
F;60-388/Product: gastricsin #status experimental <MAT>

Query Match 13.1%; Score 353; DB 2; Length 388; 

Best Local Similarity 29.1%; Pred. No. 1.1e-18;

Matches 120; Conservative 65; Mismatches 120; Indels 108; Gaps 21; 

Qy 52 PAERHADG-LALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVD 110 

I I : : I I : : I I : I : I I : I : I I I I I I : I I 

Db 50 PAWKYRFGDLSVTYEP MAYMD AAYFGEISIGTPPQNFLVLFD 91 

Qy 111 TGSSNFAV AGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDL 162 

I I I I I I I I I I I : III : : : : I I I I I I I I 

Db 92 TGSSNLWVPSVYCQSQACTSHS RFNPSESSTYSTNGQTFSLQYGSGSLTGFFGYDT 147 

Qy 163 VTIPKGFNTSFLVNIATIFESENFFLPG IKWNGILGLAYATLAKPSSSLETFFDS 217 

: I : I I III II : : : | | : | | | | | : : : I 

Db 148 LTV QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMGLAYPALSVDEAT — TAMQG 198 

Qy 218 LVTQANIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQI 276 

: | : : : | | | : : I : : I I : : I I I : : I I I I I : : I : : I I : I I 

Db 199 MVQEGALTSPVFSVYLSNQ QGSSGGAWFGGVDSSLYTGQIYWAPVTQELYWQI 252 



Qy 277 EILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGF 336



I : MM: II : | | I I : I I : I I : I I : I : : : I 

Db 253 GIEEFLIGGQASGW-CSE — GCQAI VDTGT S LLTVPQQ YMS AL LQA 2 95 

Qy 337 WTGSQLACWTNSETPWSYF PKISI YLRDENSSRSFRITILPQLYIQPMMG 386 

I I : I I : I I : : : : I I I 

Db 296 -TGAQ EDEYGQFLVNCNSIQNLPSLTFII NGVEFPLPPSSYI 336 

Qy 387 AGLNYECY-RFGISP STNA LVIGATVMEGFYVI FDRAQKRVGFAAS 431 

I : I I : I II : : I : : I : : I Mill: 

Db 337 — LSNNGYCTVGVEPTYLSSQNGQPLWILGDVFLRSYYSVYDLGNNRVGFATA 387 



RESULT 7 
JC7246 

pepsinogen C - common marmoset 

C; Species: Callithrix jacchus (common marmoset) 

C;Date: 09-Jun-2000 #sequence_revision 09-Jun-2000 #text_change 21-Jul-2000
C;Accession: JC7246 
R; Kageyama, T. 

J. Biochem. 127, 761-770, 2000 

A;Title: New world monkey pepsinogens A and C, and prochymosins . Purification, 

characterization of enzymatic properties, cDNA cloning, and molecular evolution. 

A; Reference number: JC7245 

A;Accession: JC7246 

A; Molecule type: mRNA 

A; Residues: 1-388 <KAG> 

A; Cross-references : DDBJ: AB038385 

A; Experimental source: strain NW7 91 

C; Comment: This protein, a zymogen of pepsins, is the major proteolytic enzyme 
in vertebrate gastric juices. It plays roles in gastric digestion, and is a 
useful molecular marker for clarifying the evolution of mammalian orders and 
families . 

C;Superfamily: pepsin

C; Keywords: gastric juice; zymogen 

Query Match 13.1%; Score 351.5; DB 2; Length 388; 

Best Local Similarity 30.1%; Pred. No. 1.4e-18; 

Matches 112; Conservative 56; Mismatches 115; Indels 89; Gaps 17; 

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNFAV AGTPHSYIDTYFDTERSSTYRSKGF 143 

I : I : M I I I I : I I I M M I I I I I I : II I I I I 

Db 73 YFGEISIGTPPQNFLVLFDTGSSNLWVPSVYCQSQACTSHS RFNPSASSTYSSNGQ 128 

Qy 144 DVTVKYTQGSWTGFVGEDLVTI PKGFNTSFLVNIATI FESENFFLPG IKWNGILG 198 

: : : I I I II I I I : I : I I Ml II : : : | I : I 

Db 12 9 TFSLQYGSGSLTGFFGYDTLTV QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMG 181 

Qy 199 LAYATLAKPSSSLETFFDSLVTQANIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPS 257 

II I I : : : I : : : : : M I : I : : II : : : I I : : I 

Db 182 LAY PAL SMGGAT — TAMQGMLQEGALTSPVFSFYLSNQ QGS S GGAVI FGGVDS S 233 

Qy 258 LYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFD 317 

I I I I : : I : : I I : I I I : MM: II : I II I : I I : M : II : 

Db 234 LYTGQIYWAPVTQELYWQI GIEEFLIGGQASGW-CSE — GCQAI VDTGTSLLTVPQQYMS 2 90 

Qy 318 AWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYF PKI SI YLRDENS 367 

I : II I I : I I : | I : : : 



Db 291 AFLEA TGAQ EDEYGQFLVNCDSIQNLPTLTFII 323



Qy 368 S RS FRI T I LPQL YI Q PMMGAGLN YECY- RFGI S P STNALVI GATVMEGF YVI F 419 

: I I I I : I I : I i : : I : : I : I 

Db 324 -NGVEFPLPPSSYI LSNNGYCTVGVEPTYLSSQNSQPLWILGDVFLRSYYSVF 375 

Qy 420 DRAQKRVGFAAS 431

I I I I I I : 

Db 376 DLGNNRVGFATA 387



RESULT 8 
B43356 

gastricsin (EC 3.4.23.3) precursor - guinea pig 

N;Alternate names: pepsin C 

C; Species: Cavia porcellus (guinea pig) 

C;Date: 03-Feb-1994 #sequence_revision 03-Feb-1994 #text_change 22-Jun-1999 
C;Accession: B43356 

R;Kageyama, T.; Ichinose, M. ; Tsukada, S.; Miki, K.; Kurokawa, K . ; Koiwai, O. ; 
Tanji, M. ; Yakabe, E. ; Athauda, S.B.; Takahashi, K. 
J. Biol. Chem. 267, 16450-16459, 1992 

A; Title: Gastric procathepsin E and progastricsin from guinea pig. Purification, 
molecular cloning of cDNAs, and characterization of enzymatic properties, with 
special reference to procathepsin E. 

A;Reference number: A43356; MUID:92355614; PMID:1644829
A;Accession: B43356 
A; Status: preliminary 
A; Molecule type: mRNA 
A; Residues: 1-394 <KAG> 

A;Cross-references: GB:M88652; NID:g191296; PIDN:AAA37053.1; PID:g191297
A;Note: sequence extracted from NCBI backbone (NCBIN:110805, NCBIP:110806)
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; gastric juice; hydrolase; protein digestion; 
stomach 

Query Match 12.1%; Score 324.5; DB 2; Length 394; 

Best Local Similarity 29.0%; Pred. No. 1.5e-16; 

Matches 107; Conservative 63; Mismatches 116; Indels 83; Gaps 18; 

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNF AVAGTPHSYIDTYFDTERSSTYRSKGF 143

I : : : : I I I I I I : I I I I I I I : : I I I I h MM: 

Db 79 YFGQISLGTPPQSFQVLFDTGSSNLWVPSVYCSSLACTTH TRFNPRDSSTYVATDQ 134 

Qy 144 DVTVKYTQGSWTGFVGEDLVTI PK-GFNTSFLVNIATIFESENFFLPG IK 192 

: : : I I I I I I I : I I I I I I hi II : 

Db 135 SFSLEYGTGSLTGVFGYDTMTIQDIQVPKQEFGLS ETE PGSDFVYAE 181 

Qy 193 WN GI LGLAYAT LAK PSSSLETFFDS L VTQAN I - PNVF SMQMC GAGL P VAG S — GTNGGSL 249 

:: I I I I I I I : : : : I I : : : : : I I : : II I : : I I 

Db 182 FDGILGLGYPGLSEGGAT — TAMQGLLREGALSQSLFSVYL GSQQGSDEGQL 231 

Qy 250 VLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLL 309 

: I I I : : II I I II : : I I : : I I : I I I I I : I : I I I : I I : I I 

Db 232 ILGGVDESLYTGDIYWTPVTQELYWQIGIEGFLIDGSASGWCSR GCQGIVDTGTSLL 288

Qy 310 RLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSR 369 

: I : I : I : I : : I : : | : : | I : 



Db 289 TVPSDYLSTLVQAIGAEE — NEYGEYF VSCSSIQDLPTLTFVISGV 332 

Qy 370 SFRITILPQLYIQP MMGAGLNYECYRFGISPSTN — ALVIGATVMEGFYVIFDRA 422 

: I II I : I I : I I : : I : : I : : I I 

Db 333 — EFPLSPSAYILSGENYCMVGLESTY VSPGGGEPVWILGDVFLRSYYSVYDLA 384 

Qy 423 QKRVGFAAS 431 

I I I I I : 

Db 385 NNRVGFATA 393 



RESULT 9 
JC7575 

pepsinogen A - bullfrog 

C; Species: Rana catesbeiana (bullfrog) 

C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 03-Aug-2001 
C;Accession: JC7575

R;Ikuzawa, M. ; Inokuchi, T . ; Kobayashi, K. ; Yasumasu, S. 
J. Biochem. 129, 147-153, 2001 

A;Title: Amphibian pepsinogens: Purification and characterization of Xenopus 

pepsinogens, and molecular cloning of Xenopus and bullfrog pepsinogens. 

A;Reference number: JC7573; MUID: 21064922 ; PMID: 11134969 

A; Contents: Stomach 

A;Accession: JC7575 

A;Molecule type: mRNA 

A; Residues: 1-385 <IKU> 

A; Cross-references : DDBJ: AB045376 

C; Comment: This protein is a zymogen for gastric aspartic proteinase, with 

pepsin-like activity. 

C;Genetics:

A;Gene: PgA

C;Superfamily: pepsin

C; Keywords: stomach; zymogen 

Query Match 11.9%; Score 320; DB 2; Length 385; 

Best Local Similarity 27.8%; Pred. No. 3.2e-16; 

Matches 111; Conservative 67; Mismatches 147; Indels 74; Gaps 15; 

Qy 50 GT PAERHADGLALALE P ALAS PAGAAN FLAMVDNLQGD S GRG Y YLEML I GT P PQKLQ I LV 109 

I : : I . I I : I I : I : I I I :: I I I I I I : : 

Db 39 GDYLKKHHYNPATKYFPSLAQASG EPLQNYMDIEYFGTISIGTPPQSFTVIF 90 

Qy 110 DTGSSNFAVAGTPHSYIDT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDL 162 

I I I I I I I I I : | : : : I I I : : : | : : : | I I : I I : I I 

Db 91 DTGSSNLWV PSVYCSSPACTNHHMFNPQQSSTFQATNTPVSIQYGTGSMSGFLGYDT 147 

Qy 163 WIPKGFNTSFLVNIATIFESE-NFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQ 221 

I : I I : : I I II : : I I I I II : : I I II I I : : I 

Db 148 VQVG---NIQITNQIFGLSQSEPGSFLYYSPFDGILGLAFPSLA — SSQATPVFDNMWNQ 202 

Qy 222 ANIP-NVFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILK 280 

I I : : I I : : : I : I : : I I : : I I I : : : I : I I : I I : 

Db 203 GLI PQDLFSVYL SSQGQSGSFVLFGGVDTSYYTGNLNWVPLTAETYWQITVDS 255 

Qy 281 LEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGS 340 

: I I I I : : I I I I : I I : I I I I : : I : : I : I : 

Db 256 ISIGGQVIACS GSCSAIVDTGTSLLAGPSTPI-ANIQYYIGAN QDSNGQYV — 305 



Qy 341 QLACWTNSETPWSYFP KISIYLRDENSS — RSFRITILPQLYIQPMMGAGLN 390

Db 306 -INCNNISNMPTWFTINGVQYPLPASAWRQSQQSCTSGFQAMNLP 351

Qy 391 YECYRFGI S P STNALVI GATVMEGFYVI FDRAQKRVGFA 429

Db 352 T S SGDLWI LGDVFI REYYWFDRANNYVAMA 382



RESULT 10 
REMSK 

renin (EC 3.4.23.15) precursor, renal - mouse 

N;Alternate names: angiotensin-forming enzyme; angiotensinogenase; renin 1
C; Species: Mus musculus (house mouse) 

C;Date: 30-Jun-1987 #sequence_revision 30-Jun-1987 #text_change 18-Jun-1999 
C;Accession: A00989; S07636; A22766; A22058; I57576; A05137; JH0083
R;Holm, I.; Ollo, R.; Panthier, J.J.; Rougeon, F.
EMBO J. 3, 557-562, 1984 

A; Title: Evolution of aspartyl proteases by gene duplication: the mouse renin 

gene is organized in two homologous clusters of four exons . 

A;Reference number: A00989; MUID : 84182525 ; PMID: 6370686 

A; Accession: A00989 

A; Molecule type: DNA 

A; Residues: 1-402 <HOL> 

A;Cross-references : EMBL:X00850 

R;Kim, W.S.; Murakami, K. ; Nakayama, K. 

Nucleic Acids Res. 17, 9480, 1989 

A;Title: Nucleotide sequence of a cDNA coding for mouse Ren1 preprorenin.
A;Reference number: S07636; MUID : 90067953; PMID:2685761 
A;Accession: S07636 
A;Molecule type: mRNA 
A;Residues: 1-402 <KIM>

A;Cross-references: EMBL:X16642; NID:g53930; PIDN : CAA34636 . 1 ; PID:g53931 
R;Mullins, J. J.; Burt, D.W.; Windass, J.D.; McTurk, P.; George, H.; Brammar, 
W.J. 

EMBO J. 1, 1461-1466, 1982 

A; Title: Molecular cloning of two distinct renin genes from the DBA/2 mouse. 

A; Reference number: A90968; MUID : 84207899; PMID: 6327270 

A;Accession: A22766 

A;Molecule type: mRNA 

A;Residues: 269-314,'D',316 <MUL>

R;Panthier, J.J.; Dreyfus, M. ; Roux, D.T.L.; Rougeon, F. 
Proc. Natl. Acad. Sci. U.S.A. 81, 5489-5493, 1984

A;Title: Mouse kidney and submaxillary gland renin genes differ in their 5'
putative regulatory sequences.

A;Reference number: A22058; MUID: 84298161; PMID: 6089205 
A;Accession: A22058 
A;Molecule type: DNA 
A; Residues: 1-30 <PAN> 

R;Field, L.J.; Philbrick, W.M.; Howles, P.N.; Dickinson, D.P.; McGowan, R.A.;
Gross, K.W. 

Mol. Cell. Biol. 4, 2321-2331, 1984 

A;Title: Expression of tissue-specific Ren-1 and Ren-2 genes of mice: 
Comparative analysis of 5'-proximal flanking regions.
A;Reference number: I57576; MUID:85085936; PMID:6392850
A;Accession: I57576



A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A; Residues: 1-31 <RES> 

A;Cross-references: GB:K02800; NID:g200689; PIDN:AAA40044.1; PID:g200690

C; Comment: The only known function of renal renin is to release angiotensin I 

from angiotensinogen in the plasma, initiating a cascade of reactions that 

produces an elevation of blood pressure and increased sodium retention by the 

kidney. 

C; Comment: Renal renin is synthesized by the juxtaglomerular cells of the kidney 
in response to decreased blood pressure and sodium concentration. 
C; Genetics : 
A; Gene: Ren-1 

A;Introns: 31/2; 81/3; 123/1; 162/3; 228/2; 268/2; 316/3; 349/3 
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; blood pressure control; glycoprotein; 
hydrolase; kidney; plasma 

F;1-21/Domain: signal sequence #status predicted <SIG>
F;22-64/Domain: propeptide #status predicted <PRO>
F;65-402/Product: renin #status predicted <MAT>

F;69,139,320/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;102,287/Active site: Asp #status predicted

Query Match 11.9%; Score 320; DB 1; Length 402; 

Best Local Similarity 28.6%; Pred. No. 3.4e-16; 

Matches 126; Conservative 66; Mismatches 181; Indels 68; Gaps 21; 

Qy 10 LPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPG-P 65 

: I I I I I : I I : I I I I : I II I I : I 

Db 6 MPLWALLLL WS PCTFSLPTRTATFERI PLKKMPSVREI LEERGVDMTRLSAEWGV 60 

Qy 66 PA LAS PAGAANFLAMVDNLQGDS GRGYYLEMLI GT P PQKLQI LVDTGS SN FAV 118 

I : III I : I II 111:111111 : : : I I I I : I I 

Db 61 FTKRPSLTNLTSPWLTNYL NTQ Y YGEI GI GT P PQT FKVI FDTGSANLWV 110 

Qy 119 AGTPHSY 1DTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTS 172 

I I I : : : : I I : I I I I : I I I I : : I I I : I : 

Db 111 PSTKCSRLYLACGIHSLYESSDSSSYMENGSDFT1HYGSGRVKGFLSQDSVTV-GGITVT 169 

Qy 173 FLVNIAT I FES EN FFLPGI KWNGI LGLAYATLAKP S S S LET FFDS LVTQANI - PNVFSMQ 231 

I II | : : | : | | : : I : : I I : : : I : III: 

Db 170 QTFGEVTELPLIPFML — AKFDGVLGMGFP — AQAVGGVT P VFDH I L S Q GVLKEEVF S VY 225 

Qy 232 MCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLD 291 

I I II : I I I I : I I : I : I I : : I I : : s I I I 
Db 226 Y NRGSHLLGGEWLGGSDPQHYQGNFHYVSISKTDSWQITMKGVSVG- — SSTLL 277 

Qy 292 CREYNADKAIVDSGTTLLRLPQKVFDAWEAV-ARASLIPEFSDGFWTGSQLACWTNSET 350

II I : | | : | : : : I : : : I : I : I I : : I I : 
Db 278 CEEGCA — VWDTGSSFISAPTSSLKLIMQALGAKEKRIEEY WNC SQV 324 

Qy 351 PWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGL-NYECYRFGISPSTNAL-VIG 408 

I I I I I | : : : : : I I : III : hi 

Db 325 P — TLPDISFDL GGRAYTLSSTDYVTjQYPNRRDKLCTLALHAMDIPPPTGPVWVLG 378 



Qy 409 ATVMEGFYVI FDRAQKRVGFA 429 

II: II III I : I II 
Db 379 ATFI RKFYTEFDRHNNRI GFA 399 



RESULT 11 
S66516 



oryzasin (EC 3.4.23.-) precursor - rice 
N;Alternate names: aspartic proteinase 1 
C; Species: Oryza sativa (rice) 

C;Date: 28-Oct-1996 #sequence_revision 13-Mar-1997 #text_change 20-Jun-2000 
C;Accession: S66516; S66517 

R;Asakura, T.; Watanabe, H.; Abe, K.; Arai, S. 
Eur. J. Biochem. 232, 77-83, 1995 

A;Title: Rice aspartic proteinase, oryzasin, expressed during seed ripening and 
germination, has a gene organization distinct from those of animal and microbial 
aspartic proteinases. 

A;Reference number: S66516; MUID: 96048031; PMID: 7556174 
A;Accession: S66516 
A;Molecule type: DNA 
A; Residues: 1-509 <ASA> 

A;Cross-references: EMBL:D32165; NID:g511665; PIDN:BAA06876.1; PID:g1030715
A;Accession: S66517 
A;Molecule type: mRNA 
A; Residues: 1-509 <ASZ> 

A;Cross-references: EMBL:D32144; NID:g1255684; PIDN:BAA06875.1; PID:g1711289
C;Comment: The pair of saposin repeat homology domains tagged SAP1 and SAP2
represent a cyclical permutation of a single saposin repeat. 
C; Genetics : 

A;Introns: 119/3; 140/1; 171/3; 209/2; 265/3; 279/1; 300/3; 338/3; 360/2; 412/3; 
452/3; 482/2 

C;Superfamily: oryzasin; saposin repeat homology

C; Keywords: aspartic proteinase; hydrolase 

F; 1-20/Domain: signal sequence #status predicted <SIG> 

F;21-68/Domain: propeptide #status predicted <PRO> 

F;68-509/Product: aspartic proteinase 1 #status predicted <MAT>

F;316-361/Domain: saposin repeat homology #status atypical <SAP1>

F;370-420/Domain: saposin repeat homology #status atypical <SAP2>

F;103,290/Active site: Asp #status predicted

Query Match 11.7%; Score 313.5; DB 2; Length 509; 

Best Local Similarity 23.0%; Pred. No. 1.5e-15; 

Matches 127; Conservative 75; Mismatches 179; Indels 171; Gaps 19; 

Qy 3 ALARALLLPLIAQWLLRAAPEl^PAPFTLPLRVAAATNRVVAP 62 

: : I I I : I I II I : I II : I I I I I I I I 

Db 5 S VALVLLAAVLLQALLPAS AEEGLVRI ALKKRP I DEN S RVAARLS G EEGARRLGL 59 

Qy 63 ALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSN 115

Db 60 RGANSLGGGGGEGDIVALKNYMNAQ YFGEIGVGTPPQKFTVIFDTGSSNLWVPSAK 115

Qy 116 — FAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSF 173

I : : I II : : : I I II : I : : I II II MM: 

Db 116 CYFSIACFFHS RYKSGQSSTYQKNGKPAAIQYGTGSIAGFFSEDSVTVGD 165

Qy 174 LVNIATI FESENFF LPGI KWNGILGLAYATLAKPSSSLETFFDSLVTQANI 224

Db 166 LWKDQEFIEATKEPGLTFMVAKFDGILGLGFQEISVGDA V 206



Qy 225 PNVFSMQMCG-AGLPVAGSGTN GGSLVLGGIEPSLYKGDIWYTPIKEEWYYQI 27 6 

I : I I II I I I : I I I :: I I I I I : I I : : : I : I 

Db 207 PVWYKMVEQGLVSEPVFSFWFNRHSDEGEGGEIVFGGMDPSHYKGNHTYVPVSQKGYWQF 266 

Qy 277 EILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPE 331 

I : : I M : : I : I I I I I I : I I I : : I : : : : 

Db 267 EMGDVLI GGKTTGF- CA — SGCSAIADSGTSLLAGPTAIITEINEKIGATGWSQECKTV 323 

Qy 332 FSDGF 336

I : I 

Db 324 VSQYGQQILDLLLAETQPSKICSQVGLCTFDGKHGVSAGIKSWDDEAGESNGLQSGPMC 383 

Qy 337 WTGSQLACWTNSETPWSY FPKISIYLRD 364 

I : I I I : : I I : I I : 

Db 384 NACEMAWWMQNQLAQNKTQDLI LNYINQLCDKLPS PMGES SVDCGSLASMPEI S FTI GA 443 

Qy 365 ENSSRSFRITILPQLYIQPMMGAGLNYECY RFGISPSTNAL-VIGATVMEGFYVIF 419 

: : : I : I I : I I : I II I : : I I : : : I 

Db 444 K KFALKPEEYIL-KVGEGAAAQCISGFTAMDIPPPRGPLWILGDVFMGAYHTVF 496 

Qy 420 DRAQKRVGFAAS 431 

I : I I I I I I 
Db 497 DYGKMRVGFAKS 508 



RESULT 12 
A24608 

gastricsin (EC 3.4.23.3) precursor - rat 
N; Alternate names: pepsinogen C 
N;Contains: pepsin A (EC 3.4.23.1) precursor 
C; Species: Rattus norvegicus (Norway rat) 

C;Date: 30-Jun-1988 #sequence_revision 05-Aug-1994 #text_change 18-Jun-1999 
C;Accession: A33510; A24608; C22434; A05145; A61298 

R;Ishihara, T.; Ichihara, Y.; Hayano, T.; Katsura, I.; Sogawa, K.; Fujii-Kuriyama, Y.; Takahashi, K.
J. Biol. Chem. 264, 10193-10199, 1989
A;Title: Primary structure and transcriptional regulation of rat pepsinogen C gene.
A;Reference number: A33510; MUID:89255508; PMID:2722863

A;Accession: A33510 

A;Molecule type: DNA 

A; Residues: 1-392 <ISH> 

A;Cross-references: GB:M25985

R;Ichihara, Y.; Sogawa, K.; Morohashi, K.; Fujii-Kuriyama, Y.; Takahashi, K.
Eur. J. Biochem. 161, 7-12, 1986 

A; Title: Nucleotide sequence of a nearly full-length cDNA coding for pepsinogen 
of rat gastric mucosa. 

A;Reference number: A24608; MUID:87054020; PMID:3780741
A;Accession: A24608
A;Molecule type: mRNA
A;Residues: 1-392 <ICH>
A;Cross-references: GB:X04644; NID:g56880; PIDN:CAA28305.1; PID:g56881
R; Ichihara, Y.; Sogawa, K. ; Takahashi, K. 
J. Biochem. 98, 483-492, 1985 

A; Title: Isolation of human, swine, and rat prepepsinogens and calf 
preprochymosin, and determination of the primary structures of their NH2- 
terminal signal sequences. 



A;Reference number: A22434; MUID:86059312; PMID:2415509
A;Accession: C22434
A;Molecule type: protein
A;Residues: 1-19,'X',21-23,'X',25-29 <IC2>
R;Arai, K.M.; Muto, N.; Tani, S.; Akahane, K.
Biochim. Biophys. Acta 788, 256-261, 1984
A;Title: The N-terminal sequence of rat pepsinogen.
A;Reference number: A05145; MUID:84257697; PMID:6743670
A;Accession: A05145
A;Molecule type: protein
A;Residues: 17-30,'Q',32-102,'A',104-108,'V',110-112 <ARA>
A; Experimental source: Wistar strain 
R;Ichihara, Y.; Sogawa, K.; Takahashi, K. 
J. Biochem. 92, 603-606, 1982 

A;Title: Rat gastric prepepsinogen: in vitro synthesis and partial amino-terminal signal sequence.
A;Reference number: A61298; MUID:83030750; PMID:6182139
A;Accession: A61298
A;Molecule type: protein
A;Residues: 1,'XX',4-6,'X',8-9,'X',11,'X',13-14,'XXX',18-19,'X',21,'X',23,'XX',26,'X' <IC3>

C; Comment: This enzyme has more restricted specificity than pepsin A. It is the 
major form of pepsinogen in rat gastric mucosa. 
C;Genetics:
A;Introns: 20/2; 73/3; 113/1; 152/3; 219/2; 259/2; 309/3; 342/3
A;Note: there are at least two very similar genes for gastricsin in rat
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; gastric juice; hydrolase; protein digestion; stomach
F;1-16/Domain: signal sequence #status experimental <SIG>
F;17-392/Product: pepsinogen #status experimental <MAT>
F;17-62/Domain: activation peptide #status experimental <ACT>
F;94,280/Active site: Asp #status predicted
F;107-112,270-275,314-347/Disulfide bonds: #status predicted

Query Match 11.6%; Score 313; DB 1; Length 392; 

Best Local Similarity 29.5%; Pred. No. 1.1e-15;

Matches 105; Conservative 56; Mismatches 139; Indels 56; Gaps 16; 

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNFAV AGTPHS YI DTYFDTERS STYRS KGF 143 

I : I : I I I I I I : I I I I I I I I III: I : : I I I I :: I 

Db 76 YFGEI S I GT P PQN FLVLFDTGS SNLWVS SVYCQS EACTTHA RFNPSKSSTYYTEGQ 131 

Qy 144 DVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPG IKWNGILG 198 

: : : I I I I I I I I : I : I I III II : : : | | : | 

Db 132 TFSLQYGTGSLTGFFGYDTLTV QSIQVPNQEFGLSEN — EPGTNFVYAQFDGIMG 184 

Qy 199 LAY AT LAK PSSSLETFFDS L VT Q AN I PN VF SMQMC GAGL P VAG S — GTNGGSLVLGGIEP 256 

III I : I I : : : I : I I II I : I I I : I I I : : 

Db 185 LAYPGLS — SGGATTALQGMLGE GALSQPLFGVYL GSQQGSNGGQIVFGGVDK 235 

Qy 257 SLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVF 316 

. : I I I : I : I : : I I : I I I II I : I : I I I : I I : I I : I : 

Db 236 NLYTGEITWVPVTQELYWQITIDDFLIGDQASGW-CSSQGC-QGIVDTGTSLLVMPAQYL 293 

Qy 317 DAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 376 

: : : : I : : I : : | : | | : | | : : 



Db 294 SELLQTIGAQE — GEYGEYF VSCDSVSS LPTLSFVL NGVQFPLS 335 

Qy 377 PQLY-IQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAAS 431

I I I I : I :: I : : I I I I : I I I I 

Db 336 PSSYIIQEDNFCMVGLESISLTSESGQPLWILGDVFLRSYYAIFDMGNNKVGLATS 391 



RESULT 13 
A41443 

pepsin (EC 3.4.23.-) precursor, embryonic - chicken 
C; Species: Gallus gallus (chicken) 

C;Date: 05-Jun-1992 #sequence_revision 05-Jun-1992 #text_change 21-Jul-2000
C;Accession: A41443

R;Hayashi, K.; Agata, K.; Mochii, M.; Yasugi, S.; Eguchi, G.; Mizuno, T.
J. Biochem. 103, 290-296, 1988
A;Title: Molecular cloning and the nucleotide sequence of cDNA for embryonic chicken pepsinogen: phylogenetic relationship with prochymosin.
A;Reference number: A41443; MUID:88227903; PMID:3131317
A;Accession: A41443
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-383 <HAY>
A;Cross-references: GB:D00215; NID:g2760810; PIDN:BAA00153.1; PID:g222853
C;Superfamily: pepsin

C; Keywords: aspartic proteinase; hydrolase; protein digestion 

Query Match 11.5%; Score 310; DB 2; Length 383; 

Best Local Similarity 26.8%; Pred. No. 1.8e-15; 

Matches 106; Conservative 63; Mismatches 136; Indels 90; Gaps 15; 

Qy 56 HA — DGLALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGS 113 

I I I I : I I I III 11:111111 : : I I I I 

Db 55 HAFPDVLTWTEPLL NTLDM EYYGTISIGTPPQDFTWFDTGS 97 

Qy 114 SNFAVAG TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGF 169 

II I : I I : : I I M : I I : : : : I I I I I I I I : 

Db 98 SNLWVPSVSCTSPACQSHQMFNPSQSSTYKSTGQNLSIHYGTGDMEGTVGCDTVTVASLM 157 

Qy 170 NTSFLWIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PNVF 228 

: I : I : : I I II: : I : : I I I I I I : I I : : | | : : | : : : I : I 
Db 158 DTNQLFGLST-SEPGQFFV-YVKFDGILGLGYPSLA — ADGITPVFDNMVNESLLEQNLF 213 

Qy 229 SMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSL 288 

I : : : I : I I I I : I : I I : I : : I : I I : : : I : 

Db 214 SVYLSREPM GSMWFGGI DES YFTGS INWI PVS YQGYWQI SMDS 1 1 VNKQEI 265 

Qy 289 NLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNS 348

: : I I : I : I I : I : I : : I I I 

Db 266 ACS S GCQAI I DT GT S LVAG PAS D I ND I Q S AVG ANQ 300 

Qy 349 ETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY ECY 394

II I | : | : : : | | : | I 

Db 301 NTYGEY SVNCSHILAMPDWF — VI G- GI Q YP VPALAYTEQNGQGTCM 345 

Qy 395 RFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFA 429

: I : : : I : : I I I I I I I I I I

Db 346 SSFQNSSADLWILGDVFIRVYYSIFDRANNRVGLA 380



RESULT 14 
KHHUD 

cathepsin D (EC 3.4.23.5) precursor [validated] - human 
N; Alternate names: preprocathepsin D 
C; Species: Homo sapiens (man) 

C;Date: 28-Dec-1987 #sequence_revision 28-Dec-1987 #text_change 15-Sep-2000 

C;Accession: A25771; S30749; PC2066; I59236; I57716

R;Faust, P.L.; Kornfeld, S.; Chirgwin, J.M. 

Proc. Natl. Acad. Sci. U.S.A. 82, 4910-4914, 1985 

A; Title: Cloning and sequence analysis of cDNA for human cathepsin D. 
A;Reference number: A25771; MUID : 85270436; PMID:3927292 
A;Accession: A25771 
A; Molecule type: mRNA 
A; Residues: 1-412 <FAU> 

A;Cross-references: EMBL:M11233; NID:g181179; PIDN:AAB59529.1; PID:g181180

R;Westley, B.R.; May, F.E.B. 

Nucleic Acids Res. 15, 3773-3786, 1987 

A; Title: Oestrogen regulates cathepsin D mRNA levels in oestrogen responsive 
human breast cancer cells. 

A;Reference number: S30749; MUID:87231068; PMID:3588310
A;Accession: S30749
A;Molecule type: mRNA
A;Residues: 1-412 <WES>
A;Cross-references: EMBL:X05344; NID:g29677; PIDN:CAA28955.1; PID:g29678
R;May, F.E.B. ; Smith, D.J.; Westley, B.R. 
Gene 134, 277-282, 1993 

A; Title: The human cathepsin D-encoding gene is transcribed from an estrogen- 
regulated and a constitutive start point. 
A; Reference number: PC2066; MUID : 94085791 ; PMID: 8262386 
A;Accession: PC2066 
A;Molecule type: DNA 
A; Residues: 1-23 <MAY> 

A;Cross-references: GB:L12980; NID:g291930; PIDN : AAA16314 . 1 ; PID:g455429 
A; Experimental source: MCF-7 cell 
R;Cavailles, V.; Augereau, P.; Rochefort, H. 
Proc. Natl. Acad. Sci. U.S.A. 90, 203-207, 1993 

A;Title: Cathepsin D gene is controlled by a mixed promoter, and estrogens 
stimulate only TATA-dependent transcription in breast cancer cells. 
A;Reference number: I59236; MUID:93126342; PMID:8419924
A;Accession: I59236
A;Status: translation not shown; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-22 <CAV1>
A;Cross-references: GB:S52557; NID:g263124; PIDN:AAD13868.1; PID:g4261568
R;Augereau, P.; Miralles, F. ; Cavailles, V.; Gaudelet, C. ; Parker, M.; 
Rochefort, H. 

Mol. Endocrinol. 8, 693-703, 1994 

A; Title: Characterization of the proximal estrogen-responsive element of human 
cathepsin D gene. 

A;Reference number: I57716; MUID:95021301; PMID:7935485
A;Accession: I57716
A;Status: translation not shown; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-22 <CAV2>
A;Cross-references: GB:S74689; NID:g786350; PIDN:AAD14156.1; PID:g4261856



R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Erickson, J.W. 
submitted to the Brookhaven Protein Data Bank, April 1993 
A; Reference number: A51839; PDB: 1LYA 

A;Contents: annotation; X-ray crystallography, 2.5 angstroms, residues 65-161;170-241
R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Erickson, J.W.
submitted to the Brookhaven Protein Data Bank, April 1993
A;Reference number: A51840; PDB:1LYB
A;Contents: annotation; X-ray crystallography, 2.5 angstroms, with inhibitor residues 65-161;170-241
R;Baldwin, E.T.; Bhat, T.N.; Gulnik, S.; Hosur, M.V.; Sowder II, R.C.; Cachau, R.E.; Collins, J.; Silva, A.M.; Erickson, J.W.
Proc. Natl. Acad. Sci. U.S.A. 90, 6796-6800, 1993
A;Title: Crystal structures of native and inhibited forms of human cathepsin D: implications for lysosomal targeting and drug design.
A;Reference number: A48229; MUID:93342076; PMID:8393577
A;Contents: annotation; X-ray crystallography, 2.5 angstroms

C;Comment: Cathepsin D is a ubiquitous lysosomal proteinase.
C;Comment: In addition to the propeptide, residues 163-168 and 411-412 are proteolytically removed. Residues 169 and 170 are also partially removed.
C;Comment: The carbohydrate bound to 134-Asn contains a mannose-6-phosphate that is bound near 267-Lys and the phosphotransferase recognition region.
C;Genetics:
A;Gene: GDB:CTSD
A;Cross-references: GDB:120512; OMIM:116840
A;Map position: 11p15.5-11p15.5
C;Function:
A;Description: limited specificity endopeptidase
A;Pathway: intracellular protein degradation
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; glycoprotein; hydrolase; lysosome; protein degradation
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-64/Domain: propeptide #status predicted <PRO>
F;65-162,169-410/Product: cathepsin D #status experimental <MAT>
F;267,329-356/Region: phosphotransferase recognition
F;91-160,110-117,286-290,329-366/Disulfide bonds: #status experimental
F;97,295/Active site: Asp #status experimental
F;134,263/Binding site: carbohydrate (Asn) (covalent) #status experimental

Query Match 11.5%; Score 308.5; DB 1; Length 412; 

Best Local Similarity 27.1%; Pred. No. 2.5e-15; 

Matches 121; Conservative 75; Mismatches 180; Indels 71; Gaps 22; 

Qy 9 LLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPG PGTPAERHADGLAL 62 

I I I I I I I I I I : I I : I : : I | : : : : 

Db 6 LLPLAL — CLLAAP — ASALVRIPLHKFTSIRRTMSEVGGSVEDLIAKGPVSKYSQAVPA 61 

Qy 63 ALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTP 122 

I : I I : : I I I : I I I I I I : : I I I I I I 

Db 62 VTEGPI — PEVLKNYM DAQYYGEIGIGTPPQCFTWFDTGSSNLWVPSIH 109 

Qy 123 HSYIDT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIP — KGFNTSFL 174 

: I : : : : : | I I I | : I I I : I : : : I I : : I : I I 

Db 110 CKLLDIACWIHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASAL 169 

Qy 175 -- VNIATIFESENFFLPGI KWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PN 226



I : I III | : : | | | | : | | : : : : : I I : I : I : I 

Db 170 GGVKVERQVFGEATKQPGITFIAAKFDGILGMAYPRIS-- WNVLPVFDNLMQQKLVDQN 227 

Qy 227 VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQ 286 

: I I I I I I : I I I : I I I : I : : I : I : : : : I : 

Db 228 IFSFY LSRDPDAQPGGELMLGGTDSKYYKGSLSYLNVTRKAYWQVHLDQVEV-AS 281 

Qy 287 SLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWT 346 

I I I : I : | | M : | | : | : | III : I : : I 

Db 282 GLTL-CKE — GCEAIVDTGTSLMVGP VDEVRELQKAIGAVPLIQGEY MIPC — 329 

Qy 347 NSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRF GISPSTN 403 

I I I : : I : : : : : | : | : | | I I I : 

Db 330 EKVSTLPAITLKL GGKGYKLS — PEDYTLKVSQAGKTLCLSGFMGMDIPPPSG 380 

Qy 404 AL-VIGATVMEGFYVIFDRAQKRVGFA 429

I : : I : : I : I I I I I I I I
Db 381 PLWILGDVFIGRYYTVFDRDNNRVGFA 407



RESULT 15 
KHMSD 

cathepsin D (EC 3.4.23.5) precursor - mouse 
C; Species: Mus musculus (house mouse) 

C;Date: 31-Dec-1991 #sequence_revision 31-Dec-1991 #text_change 18-Jun-1999 
C;Accession: I48278; S14704; S12587

R;Hetman, M. ; Perschl, A.; Saftig, P.; Von Figura, K. ; Peters, C. 
DNA Cell Biol. 13, 419-427, 1994 

A;Title: Mouse cathepsin D gene: molecular organization, characterization of the 
promoter, and chromosomal localization. 

A;Reference number: I48278; MUID:94280622; PMID:8011168
A;Accession: I48278

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A;Residues: 1-410 <RES> 

A;Cross-references: EMBL:X68378; NID:g50302; PIDN: CAA48453 . 1; PID:g817945 
R;Diedrich, J.F.; Staskus, K.A.; Retzel, E.F.; Haase, A.T. 
Nucleic Acids Res. 18, 7184, 1990 

A; Title: Nucleotide sequence of a cDNA encoding mouse cathepsin D. 
A;Reference number: S14704; MUID:91088345; PMID:2263503
A;Accession: S14704
A;Molecule type: mRNA
A;Residues: 1-410 <DIE>
A;Cross-references: EMBL:X53337; NID:g50300; PIDN:CAA37423.1; PID:g50301
R;Grusby, M.J.; Mitchell, S.C.; Glimcher, L.H. 
Nucleic Acids Res. 18, 4008, 1990 

A;Title: Molecular cloning of mouse cathepsin D. 
A;Reference number: S12587; MUID:90326544; PMID:2374732
A;Accession: S12587
A;Molecule type: mRNA
A;Residues: 1-410 <GRU>
A;Cross-references: EMBL:X52886; NID:g50298; PIDN:CAA37067.1; PID:g50299
C;Genetics:
A;Introns: 23/2; 76/3; 118/1; 157/3; 233/2; 274/2; 322/3; 355/3
C;Function:
A;Description: limited specificity endopeptidase
A;Pathway: intracellular protein degradation
C;Superfamily: pepsin
C;Keywords: aspartic proteinase; glycoprotein; hydrolase; lysosome; protein degradation
F;1-20/Domain: signal sequence #status predicted <SIG>
F;21-64/Domain: propeptide #status predicted <PRO>
F;65-410/Product: cathepsin D, single-chain form #status predicted <MAT>
F;91-160,110-117,284-288,327-364/Disulfide bonds: #status predicted
F;97,293/Active site: Asp #status predicted
F;134,261/Binding site: carbohydrate (Asn) (covalent) #status predicted

Query Match 11.4%; Score 306.5; DB 1; Length 410; 

Best Local Similarity 27.5%; Pred. No. 3.6e-15; 

Matches 103; Conservative 64; Mismatches 123; Indels 85; Gaps 15;

Qy 92 YYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDT YFDTERSSTYRSKGFDV 145

II : : I I I I I I I I I I II I : I I I I I I

Db 79 YYGDIGIGTPPQCFTWFDTGSSNLWVPSIHCKILDIACWVHHKYNSDKSSTYVKNGTSF 138

Qy 146 TVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT IFESENFFLPGI KWNGIL 197

: I | | : | : : : | | : : I I I I I I I I : : I I I

Db 139 DIHYGSGSLSGYLSQDTVSVPCKSDQSKARGIKVEKQIF-GEATKQPGIVFVAAKFDGIL 197

Qy 198 GLAYATLAKPSSSLETFFDSLVTQANI-PNVFSMQMCGAGLPVAGSGTNGGSLVLGGIEP 256

I : I I I : I : I I : I I I I I I I : I I I :

Db 198 GMGYPHI S -- VNNVL P VFDN LMQQKLVDKN I FS FY LNRDPEGQPGGELMLGGTDS 250

Qy 257 SLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVF 316

: : I : : I : I : : : I I : I : I I I : I I I I : II : I

Db 251 KYYHGELSYLNVTRKAYWQVHMDQLEVGNE-LTL-CK -- GGCEAIVDTGTSLLVGPVEEV 306

Qy 317 DAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 376

: : I : I I : I : : : I

Db 307 KELQKAIGAVPLI QGEYMIPCEKVSSL 333

Qy 377 PQLYIQPMMGAGLNYEC YRFGIS P STNALVI GATVMEG 414

I : I : : : I I I I

Db 334 PTVYLK -- LG-GKNYELHPDKYILKVSQGGKTICLSGFMGMDIPPPSGPLWILGDVFIGS 390

Qy 415 FYVI FDRAQKRVGFA 429

: I : I I I I I

Db 391 YYTVFDRDNNRVGFA 405



Search completed: March 4, 2004, 15:40:59 
Job time : 29.1043 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: March 4, 2004, 15:39:01 ; Search time 57.8617 Seconds
(without alignments)
1890.324 Million cell updates/sec



Title: US-09-668-314C-2
Perfect score: 2687
Sequence: 1 MGALARALLLPLLAQWLLRA......RPRDPEVVNDESSLVRHRWK 518



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 
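
The header above names a Smith-Waterman ("sw") model with the BLOSUM62 table and gap penalties Gapop 10.0 and Gapext 0.5. As a rough illustration of what such a local-alignment score computes, the following Python sketch scores two sequences with affine gap penalties. It is not GenCore's implementation. The function names, the toy match/mismatch scorer (a stand-in for BLOSUM62, which is not reproduced here), and the convention that a gap of length n costs gap_open + (n - 1) * gap_extend are all assumptions made for this example.

```python
def toy_score(x, y):
    """Hypothetical stand-in for a substitution table such as BLOSUM62."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Return the best local alignment score of a and b (affine gaps).

    Assumed convention: the first gapped residue costs gap_open and each
    further residue in the same gap costs gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # h: best score of a local alignment ending at (i, j)
    # e: best score ending at (i, j) with a gap in a (residue of b unpaired)
    # f: best score ending at (i, j) with a gap in b (residue of a unpaired)
    h = [[0.0] * (m + 1) for _ in range(n + 1)]
    e = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    f = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e[i][j] = max(h[i][j - 1] - gap_open, e[i][j - 1] - gap_extend)
            f[i][j] = max(h[i - 1][j] - gap_open, f[i - 1][j] - gap_extend)
            diag = h[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            h[i][j] = max(0.0, diag, e[i][j], f[i][j])
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    # Twelve matches bridged by one gap of length two under the toy scorer.
    print(smith_waterman_affine("AAAAAATTTTTT", "AAAAAACCTTTTTT", toy_score))
```

With the real substitution table in place of `toy_score`, the same recurrence would apply; whether GenCore charges the opening penalty in exactly this way is not stated in the report.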



Searched: 809742 seqs, 211153259 residues
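
The throughput figure in this header can be sanity-checked from the other header values. The sketch below assumes that one "cell update" means one dynamic-programming cell, that is, one query residue compared against one database residue; the report does not define the term, so that reading is an assumption, as is the function name. Under it, the rate is query length times residues searched, divided by the search time.

```python
def cell_updates_per_second(query_length, db_residues, search_seconds):
    """Assumed definition: one cell per (query residue, database residue) pair."""
    return query_length * db_residues / search_seconds


if __name__ == "__main__":
    # Values taken from this header: 518-residue query, 211153259 residues
    # searched, search time 57.8617 seconds.
    rate = cell_updates_per_second(518, 211_153_259, 57.8617)
    print(f"{rate / 1e6:.3f} million cell updates/sec")
```

The result agrees with the reported 1890.324 Million cell updates/sec to within the rounding of the search time.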



Total number of hits satisfying chosen parameters: 809742



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database :       Published_Applications_AA:*
 1:  /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
 2:  /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
 3:  /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
 4:  /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
 5:  /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
 6:  /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
 7:  /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
 8:  /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
 9:  /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
 10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
 11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
 12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
 13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
 14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
 15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
 16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
 17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
 18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
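
The per-result percentages in this listing can be cross-checked arithmetically. The figures reported are consistent with Query Match being the score divided by the perfect score, and Best Local Similarity being the number of matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). Those formulas are inferred from the numbers in the report, not quoted from GenCore documentation, and the function names below are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Inferred relation: score as a percentage of the query's self-score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Inferred relation: identical columns over all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # A low-scoring hit from this listing: Score 313.5 against perfect score
    # 2687, with Matches 127, Conservative 75, Mismatches 179, Indels 171.
    print(round(query_match_percent(313.5, 2687), 1))
    print(round(best_local_similarity_percent(127, 75, 179, 171), 1))
    # An identical hit: Score 2687, Matches 518, nothing else.
    print(round(query_match_percent(2687, 2687), 1))
    print(round(best_local_similarity_percent(518, 0, 0, 0), 1))
```

Pred. No. is not reproduced here, since it depends on the total score distribution of the search, which the listing does not include.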
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ALIGNMENTS 



RESULT 1 
US-09-794-927-2 

; Sequence 2, Application US/09794927 

; Patent No. US20010016324A1 

; GENERAL INFORMATION: 

; APPLICANT: Gurney, Mark E.
; APPLICANT: Bienkowski, Michael J.
; APPLICANT: Heinrikson, Robert L.
; APPLICANT: Parodi, Luis A.
; APPLICANT: Yan, Riqiang
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
; TITLE OF INVENTION: THEREFOR
; FILE REFERENCE: 28341/6280FG
; CURRENT APPLICATION NUMBER: US/09/794,927
CURRENT FILING DATE: 2001-02-27 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 60/101,594 
PRIOR FILING DATE: 1998-09-24 
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 2 
LENGTH: 518 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-794-927-2 



Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
              ||||||||||||||||||||||||||||||||||||||
Db        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 2
US-09-795-847-2

Sequence 2, Application US/09795847
Patent No. US20010018208A1
GENERAL INFORMATION:
APPLICANT: Gurney, Mark E.
APPLICANT: Bienkowski, Michael J.
APPLICANT: Heinrikson, Robert L.
APPLICANT: Parodi, Luis A.
APPLICANT: Yan, Riqiang
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
TITLE OF INVENTION: THEREFOR 
FILE REFERENCE: 28341/6280DE 
CURRENT APPLICATION NUMBER: US/09/795,847
CURRENT FILING DATE: 2001-02-28 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 60/101,594 
PRIOR FILING DATE: 1998-09-24 
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 2 
LENGTH: 518 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-795-847-2 



Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
              ||||||||||||||||||||||||||||||||||||||
Db        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 3 
US-09-794-743-2 

Sequence 2, Application US/09794743 
Patent No. US20010021391A1 
GENERAL INFORMATION: 
APPLICANT: Gurney, Mark E. 
APPLICANT: Bienkowski, Michael J. 
APPLICANT: Heinrikson, Robert L. 
APPLICANT: Parodi, Luis A. 
APPLICANT: Yan, Riqiang 

TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280BC
CURRENT APPLICATION NUMBER: US/09/794,743
CURRENT FILING DATE: 2001-02-27 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23 



; PRIOR APPLICATION NUMBER: 60/101,594 

; PRIOR FILING DATE: 1998-09-24 

; NUMBER OF SEQ ID NOS: 73

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 2 

LENGTH: 518 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-09-794-743-2 

Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
              ||||||||||||||||||||||||||||||||||||||
Db        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 4 
US-09-794-748-2 

; Sequence 2, Application US/09794748 

; Patent No. US20020037315A1 

; GENERAL INFORMATION: 

; APPLICANT: Gurney, Mark E. 



APPLICANT: Bienkowski, Michael J. 
APPLICANT: Heinrikson, Robert L. 
APPLICANT: Parodi, Luis A. 
APPLICANT: Yan, Riqiang 

TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
TITLE OF INVENTION: THEREFOR
FILE REFERENCE: 28341/6280JL
CURRENT APPLICATION NUMBER: US/09/794,748
CURRENT FILING DATE: 2001-02-27 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 60/101,594 
PRIOR FILING DATE: 1998-09-24 
NUMBER OF SEQ ID NOS: 73
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 2 
LENGTH: 518 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-794-748-2 

Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
              ||||||||||||||||||||||||||||||||||||||
Db        481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 5 
US-09-794-925-2 

; Sequence 2, Application US/09794925
; Patent No. US20020064819A1
; GENERAL INFORMATION:
; APPLICANT: Gurney, Mark E.
; APPLICANT: Bienkowski, Michael J.
; APPLICANT: Heinrikson, Robert L.
; APPLICANT: Parodi, Luis A.
; APPLICANT: Yan, Riqiang
; TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES
; TITLE OF INVENTION: THEREFOR
; FILE REFERENCE: 28341/6280HI
; CURRENT APPLICATION NUMBER: US/09/794,925
; CURRENT FILING DATE: 2001-02-27
; PRIOR APPLICATION NUMBER: 09/416,901
; PRIOR FILING DATE: 1999-10-13
; PRIOR APPLICATION NUMBER: 60/155,493
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: 09/404,133
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: PCT/US99/20881
; PRIOR FILING DATE: 1999-09-23
; PRIOR APPLICATION NUMBER: 60/101,594
; PRIOR FILING DATE: 1998-09-24
; NUMBER OF SEQ ID NOS: 73
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 2
; LENGTH: 518
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-794-925-2

Query Match 100.0%; Score 2687; DB 9; Length 518;

Best Local Similarity 100.0%; Pred. No. 2.8e-240;

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 6 

US-09-215-450-19 

; Sequence 19, Application US/09215450 

; Patent No. US20020068278A1 

; GENERAL INFORMATION: 

; APPLICANT: Giese, Klaus 

; APPLICANT: Xin, Hong 

; TITLE OF INVENTION: METASTATIC BREAST AND COLON CANCER REGULATED GENES 

; FILE REFERENCE: 1451.100 / 210030.447 

; CURRENT APPLICATION NUMBER: US/09/215,450
; CURRENT FILING DATE: 1998-12-17
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 19
; LENGTH: 518
; TYPE: PRT
; ORGANISM: human
US-09-215-450-19 

Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518





RESULT 7 
US-09-681-442-2 

Sequence 2, Application US/09681442 
Patent No. US20020081634A1 
GENERAL INFORMATION: 
APPLICANT: Gurney, Mark E. 
APPLICANT: Bienkowski, Michael J. 
APPLICANT: Heinrikson, Robert L. 
APPLICANT: Parodi, Luis A. 
APPLICANT: Yan, Riqiang 
TITLE OF INVENTION: ALZHEIMER'S DISEASE SECRETASE, APP SUBSTRATES THEREFOR, AND USES 
TITLE OF INVENTION: THEREFOR 
FILE REFERENCE: 28341/6280FG 
CURRENT APPLICATION NUMBER: US/09/681,442 
CURRENT FILING DATE: 2001-04-05 
PRIOR APPLICATION NUMBER: 09/416,901 
PRIOR FILING DATE: 1999-10-13 
PRIOR APPLICATION NUMBER: 60/155,493 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 09/404,133 
PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: PCT/US99/20881 



PRIOR FILING DATE: 1999-09-23 
PRIOR APPLICATION NUMBER: 60/101,594 
PRIOR FILING DATE: 1998-09-24 
NUMBER OF SEQ ID NOS : 73 
SOFTWARE: PatentIn Ver. 2.0 
SEQ ID NO 2 
LENGTH: 518 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-681-442-2 

Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 8 

US-09-978-295A-196 

; Sequence 196, Application US/09978295A 
; Patent No. US20020156006A1 
; GENERAL INFORMATION: 



; APPLICANT: Ashkenazi, Avi
; APPLICANT: Baker, Kevin P.
; APPLICANT: Botstein, David
; APPLICANT: Desnoyers, Luc
; APPLICANT: Eaton, Dan
; APPLICANT: Ferrara, Napoleon
; APPLICANT: Filvaroff, Ellen
; APPLICANT: Fong, Sherman
; APPLICANT: Gao, Wei-Qiang
; APPLICANT: Gerber, Hanspeter
; APPLICANT: Gerritsen, Mary E.
; APPLICANT: Goddard, Audrey
; APPLICANT: Godowski, Paul J.
; APPLICANT: Grimaldi, J. Christopher
; APPLICANT: Gurney, Austin L.
; APPLICANT: Hillan, Kenneth J.
; APPLICANT: Kljavin, Ivar J.
; APPLICANT: Kuo, Sophia S.
; APPLICANT: Napier, Mary A.
; APPLICANT: Pan, James
; APPLICANT: Paoni, Nicholas F.
; APPLICANT: Roy, Margaret Ann
; APPLICANT: Shelton, David L.
; APPLICANT: Stewart, Timothy A.
; APPLICANT: Tumas, Daniel
; APPLICANT: Williams, P. Mickey
; APPLICANT: Wood, William I.

TITLE OF INVENTION: Secreted and Transmembrane Polypeptides and Nucleic 
TITLE OF INVENTION: Acids Encoding the Same 
FILE REFERENCE: P2630P1C11 

CURRENT APPLICATION NUMBER: US/09/978,295A 

CURRENT FILING DATE: 2001-10-15 

PRIOR APPLICATION NUMBER: 09/918585 

PRIOR FILING DATE: 2001-07-30 

PRIOR APPLICATION NUMBER: 60/062250 

PRIOR FILING DATE: 1997-10-17 

PRIOR APPLICATION NUMBER: 60/064249 

PRIOR FILING DATE: 1997-11-03 

PRIOR APPLICATION NUMBER: 60/065311 

PRIOR FILING DATE: 1997-11-13 

PRIOR APPLICATION NUMBER: 60/066364 

PRIOR FILING DATE: 1997-11-21 

PRIOR APPLICATION NUMBER: 60/077450 

PRIOR FILING DATE: 1998-03-10 

PRIOR APPLICATION NUMBER: 60/077632 

PRIOR FILING DATE: 1998-03-11 

PRIOR APPLICATION NUMBER: 60/077641 

PRIOR FILING DATE: 1998-03-11 

PRIOR APPLICATION NUMBER: 60/077649 

PRIOR FILING DATE: 1998-03-11 

PRIOR APPLICATION NUMBER: 60/077791 

PRIOR FILING DATE: 1998-03-12 

PRIOR APPLICATION NUMBER: 60/078004 

PRIOR FILING DATE: 1998-03-13 

PRIOR APPLICATION NUMBER: 60/078886 

PRIOR FILING DATE: 1998-03-20 

PRIOR APPLICATION NUMBER: 60/078936 



PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/078910 
PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/078939 
PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/079294 
PRIOR FILING DATE: 1998-03-25 
PRIOR APPLICATION NUMBER: 60/079656 
PRIOR FILING DATE: 1998-03-26 
PRIOR APPLICATION NUMBER: 60/079664 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079689 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079663 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079728 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079786 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079920 
PRIOR FILING DATE: 1998-03-30 
PRIOR APPLICATION NUMBER: 60/079923 
PRIOR FILING DATE: 1998-03-30 
PRIOR APPLICATION NUMBER: 60/080105 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080107 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080165 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080194 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080327 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/080328 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/080333 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/080334 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/081070 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081049 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081071 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081195 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081203 
PRIOR FILING DATE: 1998-04-09 
PRIOR APPLICATION NUMBER: 60/081229 
PRIOR FILING DATE: 1998-04-09 
PRIOR APPLICATION NUMBER: 60/081955 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081817 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081819 
PRIOR FILING DATE: 1998-04-15 



PRIOR APPLICATION NUMBER: 60/081952 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081838 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/082568 
PRIOR FILING DATE: 1998-04-21 
PRIOR APPLICATION NUMBER: 60/082569 
PRIOR FILING DATE: 1998-04-21 
PRIOR APPLICATION NUMBER: 60/082704 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082804 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082700 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082797 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082796 
PRIOR FILING DATE: 1998-04-23 
PRIOR APPLICATION NUMBER: 60/083336 
PRIOR FILING DATE: 1998-04-27 
PRIOR APPLICATION NUMBER: 60/083322 
PRIOR FILING DATE: 1998-04-28 
PRIOR APPLICATION NUMBER: 60/083392 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083495 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083496 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083499 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083545 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083554 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083558 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083559 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083500 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083742 
PRIOR FILING DATE: 1998-04-30 
PRIOR APPLICATION NUMBER: 60/084366 
PRIOR FILING DATE: 1998-05-05 
PRIOR APPLICATION NUMBER: 60/084414 
PRIOR FILING DATE: 1998-05-06 
PRIOR APPLICATION NUMBER: 60/084441 
PRIOR FILING DATE: 1998-05-06 
PRIOR APPLICATION NUMBER: 60/084637 
PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/084639 
PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/084640 
PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/084598 
PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/084600 



PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/084627 
PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/084643 
PRIOR FILING DATE: 1998-05-07 
PRIOR APPLICATION NUMBER: 60/085339 
PRIOR FILING DATE: 1998-05-13 
PRIOR APPLICATION NUMBER: 60/085338 
PRIOR FILING DATE: 1998-05-13 
PRIOR APPLICATION NUMBER: 60/085323 
PRIOR FILING DATE: 1998-05-13 
PRIOR APPLICATION NUMBER: 60/085582 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085700 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085689 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085579 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085580 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085573 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085704 
PRIOR FILING DATE: 1998-05-15 
PRIOR APPLICATION NUMBER: 60/085697 



Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 9 
US-09-886-143-2 

; Sequence 2, Application US/09886143 

; Patent No. US20020159991A1 

; GENERAL INFORMATION: 

; APPLICANT: Cordell, Barbara 

; APPLICANT: Schimmoller, Frauke 

; APPLICANT: Liu, Yu-Wang 

; APPLICANT: Quon, Diana Horn 

; TITLE OF INVENTION: Modulation of A Levels by 
; TITLE OF INVENTION: Secretase BACE2 
; FILE REFERENCE: SCIOS.022A 

; CURRENT APPLICATION NUMBER: US/09/886,143 

; CURRENT FILING DATE: 2001-06-20 

; PRIOR APPLICATION NUMBER: 60/215,729 

; PRIOR FILING DATE: 2000-06-28 

; NUMBER OF SEQ ID NOS: 6 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 
; LENGTH: 518 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-09-886-143-2 

Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 10 
US-09-978-697-196 

Sequence 196, Application US/09978697 
Patent No. US20020169284A1 
GENERAL INFORMATION: 



APPLICANT: Ashkenazi, Avi
APPLICANT: Baker, Kevin P.
APPLICANT: Botstein, David
APPLICANT: Desnoyers, Luc
APPLICANT: Eaton, Dan
APPLICANT: Ferrara, Napoleon
APPLICANT: Filvaroff, Ellen
APPLICANT: Fong, Sherman
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerber, Hanspeter
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Grimaldi, J. Christopher
APPLICANT: Gurney, Austin L.
APPLICANT: Hillan, Kenneth J.
APPLICANT: Kljavin, Ivar J.
APPLICANT: Kuo, Sophia S.
APPLICANT: Napier, Mary A.
APPLICANT: Pan, James
APPLICANT: Paoni, Nicholas F.
APPLICANT: Roy, Margaret Ann
APPLICANT: Shelton, David L.
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Williams, P. Mickey
APPLICANT: Wood, William I.

TITLE OF INVENTION: Secreted and Transmembrane Polypeptides and Nucleic 
TITLE OF INVENTION: Acids Encoding the Same 
FILE REFERENCE: P2630P1C27 

CURRENT APPLICATION NUMBER: US/09/978,697 
CURRENT FILING DATE: 2001-10-16 
PRIOR APPLICATION NUMBER: 09/918585 



PRIOR FILING DATE: 2001-07-30 
PRIOR APPLICATION NUMBER: 60/062250 
PRIOR FILING DATE: 1997-10-17 
PRIOR APPLICATION NUMBER: 60/064249 
PRIOR FILING DATE: 1997-11-03 
PRIOR APPLICATION NUMBER: 60/065311 
PRIOR FILING DATE: 1997-11-13 
PRIOR APPLICATION NUMBER: 60/066364 
PRIOR FILING DATE: 1997-11-21 
PRIOR APPLICATION NUMBER: 60/077450 
PRIOR FILING DATE: 1998-03-10 
PRIOR APPLICATION NUMBER: 60/077632 
PRIOR FILING DATE: 1998-03-11 
PRIOR APPLICATION NUMBER: 60/077641 
PRIOR FILING DATE: 1998-03-11 
PRIOR APPLICATION NUMBER: 60/077649 
PRIOR FILING DATE: 1998-03-11 
PRIOR APPLICATION NUMBER: 60/077791 
PRIOR FILING DATE: 1998-03-12 
PRIOR APPLICATION NUMBER: 60/078004 
PRIOR FILING DATE: 1998-03-13 
PRIOR APPLICATION NUMBER: 60/078886 
PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/078936 
PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/078910 
PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/078939 
PRIOR FILING DATE: 1998-03-20 
PRIOR APPLICATION NUMBER: 60/079294 
PRIOR FILING DATE: 1998-03-25 
PRIOR APPLICATION NUMBER: 60/079656 
PRIOR FILING DATE: 1998-03-26 
PRIOR APPLICATION NUMBER: 60/079664 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079689 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079663 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079728 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079786 
PRIOR FILING DATE: 1998-03-27 
PRIOR APPLICATION NUMBER: 60/079920 
PRIOR FILING DATE: 1998-03-30 
PRIOR APPLICATION NUMBER: 60/079923 
PRIOR FILING DATE: 1998-03-30 
PRIOR APPLICATION NUMBER: 60/080105 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080107 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080165 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080194 
PRIOR FILING DATE: 1998-03-31 
PRIOR APPLICATION NUMBER: 60/080327 
PRIOR FILING DATE: 1998-04-01 



PRIOR APPLICATION NUMBER: 60/080328 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/080333 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/080334 
PRIOR FILING DATE: 1998-04-01 
PRIOR APPLICATION NUMBER: 60/081070 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081049 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081071 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081195 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081203 
PRIOR FILING DATE: 1998-04-09 
PRIOR APPLICATION NUMBER: 60/081229 
PRIOR FILING DATE: 1998-04-09 
PRIOR APPLICATION NUMBER: 60/081955 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081817 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081819 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081952 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081838 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/082568 
PRIOR FILING DATE: 1998-04-21 
PRIOR APPLICATION NUMBER: 60/082569 
PRIOR FILING DATE: 1998-04-21 
PRIOR APPLICATION NUMBER: 60/082704 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082804 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082700 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082797 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082796 
PRIOR FILING DATE: 1998-04-23 
PRIOR APPLICATION NUMBER: 60/083336 
PRIOR FILING DATE: 1998-04-27 
PRIOR APPLICATION NUMBER: 60/083322 
PRIOR FILING DATE: 1998-04-28 
PRIOR APPLICATION NUMBER: 60/083392 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083495 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083496 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083499 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083545 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083554 



PRIOR FILING DATE: 1998-04-29
PRIOR APPLICATION NUMBER: 60/083558
PRIOR FILING DATE: 1998-04-29
PRIOR APPLICATION NUMBER: 60/083559
PRIOR FILING DATE: 1998-04-29
PRIOR APPLICATION NUMBER: 60/083500
PRIOR FILING DATE: 1998-04-29
PRIOR APPLICATION NUMBER: 60/083742
PRIOR FILING DATE: 1998-04-30
PRIOR APPLICATION NUMBER: 60/084366
PRIOR FILING DATE: 1998-05-05
PRIOR APPLICATION NUMBER: 60/084414
PRIOR FILING DATE: 1998-05-06
PRIOR APPLICATION NUMBER: 60/084441
PRIOR FILING DATE: 1998-05-06
PRIOR APPLICATION NUMBER: 60/084637
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084639
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084640
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084598
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084600
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084627
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084643
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/085339
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085338
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085323
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085582
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085700
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085689
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085579
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085580
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085573
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085704
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085697



Query Match 100.0%; Score 2687; DB 9; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
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Qy 1 MGALARAL LL PL LAQWLL RAAP ELAP AP FT LP LRVAAATN R WAP T P GP GT P AERHADGL 60 

       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
APPLICANT: Baker Kevin P.
APPLICANT: Botstein, David
APPLICANT: Desnoyers, Luc
APPLICANT: Eaton, Dan
APPLICANT: Ferrara, Napoleon
APPLICANT: Filvaroff, Ellen
APPLICANT: Fong, Sherman
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerber, Hanspeter
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Grimaldi, J. Christopher
APPLICANT: Gurney, Austin L.
APPLICANT: Hillan, Kenneth J
APPLICANT: Kljavin, Ivar J.
APPLICANT: Kuo, Sophia S.
APPLICANT: Napier, Mary A.
APPLICANT: Pan, James;
APPLICANT: Paoni, Nicholas F.
APPLICANT: Roy, Margaret Ann
APPLICANT: Shelton, David L.
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Williams, P. Mickey
APPLICANT: Wood, William I.



TITLE OF INVENTION: Secreted and Transmembrane Polypeptides and Nucleic 
TITLE OF INVENTION: Acids Encoding the Same 
FILE REFERENCE: P2630P1C7 

CURRENT APPLICATION NUMBER: US/09/978,189
CURRENT FILING DATE: 2001-10-15 
PRIOR APPLICATION NUMBER: 09/918585 
PRIOR FILING DATE: 2001-07-30 
PRIOR APPLICATION NUMBER: 60/062250 
PRIOR FILING DATE: 1997-10-17 
PRIOR APPLICATION NUMBER: 60/064249 
PRIOR FILING DATE: 1997-11-03 
PRIOR APPLICATION NUMBER: 60/065311 
PRIOR FILING DATE: 1997-11-13 
PRIOR APPLICATION NUMBER: 60/066364 
PRIOR FILING DATE: 1997-11-21 



PRIOR APPLICATION NUMBER: 60/077450
PRIOR FILING DATE: 1998-03-10
PRIOR APPLICATION NUMBER: 60/077632
PRIOR FILING DATE: 1998-03-11
PRIOR APPLICATION NUMBER: 60/077641
PRIOR FILING DATE: 1998-03-11
PRIOR APPLICATION NUMBER: 60/077649
PRIOR FILING DATE: 1998-03-11
PRIOR APPLICATION NUMBER: 60/077791
PRIOR FILING DATE: 1998-03-12
PRIOR APPLICATION NUMBER: 60/078004
PRIOR FILING DATE: 1998-03-13
PRIOR APPLICATION NUMBER: 60/078886
PRIOR FILING DATE: 1998-03-20
PRIOR APPLICATION NUMBER: 60/078936
PRIOR FILING DATE: 1998-03-20
PRIOR APPLICATION NUMBER: 60/078910
PRIOR FILING DATE: 1998-03-20
PRIOR APPLICATION NUMBER: 60/078939
PRIOR FILING DATE: 1998-03-20
PRIOR APPLICATION NUMBER: 60/079294
PRIOR FILING DATE: 1998-03-25
PRIOR APPLICATION NUMBER: 60/079656
PRIOR FILING DATE: 1998-03-26
PRIOR APPLICATION NUMBER: 60/079664
PRIOR FILING DATE: 1998-03-27
PRIOR APPLICATION NUMBER: 60/079689
PRIOR FILING DATE: 1998-03-27
PRIOR APPLICATION NUMBER: 60/079663
PRIOR FILING DATE: 1998-03-27
PRIOR APPLICATION NUMBER: 60/079728
PRIOR FILING DATE: 1998-03-27
PRIOR APPLICATION NUMBER: 60/079786
PRIOR FILING DATE: 1998-03-27
PRIOR APPLICATION NUMBER: 60/079920
PRIOR FILING DATE: 1998-03-30
PRIOR APPLICATION NUMBER: 60/079923
PRIOR FILING DATE: 1998-03-30
PRIOR APPLICATION NUMBER: 60/080105
PRIOR FILING DATE: 1998-03-31
PRIOR APPLICATION NUMBER: 60/080107
PRIOR FILING DATE: 1998-03-31
PRIOR APPLICATION NUMBER: 60/080165
PRIOR FILING DATE: 1998-03-31
PRIOR APPLICATION NUMBER: 60/080194
PRIOR FILING DATE: 1998-03-31
PRIOR APPLICATION NUMBER: 60/080327
PRIOR FILING DATE: 1998-04-01
PRIOR APPLICATION NUMBER: 60/080328
PRIOR FILING DATE: 1998-04-01
PRIOR APPLICATION NUMBER: 60/080333
PRIOR FILING DATE: 1998-04-01
PRIOR APPLICATION NUMBER: 60/080334
PRIOR FILING DATE: 1998-04-01
PRIOR APPLICATION NUMBER: 60/081070
PRIOR FILING DATE: 1998-04-08
PRIOR APPLICATION NUMBER: 60/081049
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081071 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081195 
PRIOR FILING DATE: 1998-04-08 
PRIOR APPLICATION NUMBER: 60/081203 
PRIOR FILING DATE: 1998-04-09 
PRIOR APPLICATION NUMBER: 60/081229 
PRIOR FILING DATE: 1998-04-09 
PRIOR APPLICATION NUMBER: 60/081955 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081817 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081819 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081952 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/081838 
PRIOR FILING DATE: 1998-04-15 
PRIOR APPLICATION NUMBER: 60/082568 
PRIOR FILING DATE: 1998-04-21 
PRIOR APPLICATION NUMBER: 60/082569 
PRIOR FILING DATE: 1998-04-21 
PRIOR APPLICATION NUMBER: 60/082704 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082804 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082700 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082797 
PRIOR FILING DATE: 1998-04-22 
PRIOR APPLICATION NUMBER: 60/082796 
PRIOR FILING DATE: 1998-04-23 
PRIOR APPLICATION NUMBER: 60/083336 
PRIOR FILING DATE: 1998-04-27 
PRIOR APPLICATION NUMBER: 60/083322 
PRIOR FILING DATE: 1998-04-28 
PRIOR APPLICATION NUMBER: 60/083392 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083495 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083496 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083499 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083545 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083554 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083558 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083559 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083500 
PRIOR FILING DATE: 1998-04-29 
PRIOR APPLICATION NUMBER: 60/083742 
PRIOR FILING DATE: 1998-04-30 



PRIOR APPLICATION NUMBER: 60/084366
PRIOR FILING DATE: 1998-05-05
PRIOR APPLICATION NUMBER: 60/084414
PRIOR FILING DATE: 1998-05-06
PRIOR APPLICATION NUMBER: 60/084441
PRIOR FILING DATE: 1998-05-06
PRIOR APPLICATION NUMBER: 60/084637
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084639
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084640
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084598
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084600
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084627
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084643
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/085339
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085338
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085323
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085582
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085700
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085689
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085579
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085580
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085573
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085704
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085697



Query Match 100.0%; Score 2687; DB 10; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 14 

US-09-978-608A-196

Sequence 196, Application US/09978608A 
Publication No. US20030045462A1 
GENERAL INFORMATION: 



APPLICANT: Ashkenazi, Avi
APPLICANT: Baker Kevin P.
APPLICANT: Botstein, David
APPLICANT: Desnoyers, Luc
APPLICANT: Eaton, Dan
APPLICANT: Ferrara, Napoleon
APPLICANT: Filvaroff, Ellen
APPLICANT: Fong, Sherman
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerber, Hanspeter
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Grimaldi, J. Christopher
APPLICANT: Gurney, Austin L.
APPLICANT: Hillan, Kenneth J
APPLICANT: Kljavin, Ivar J.
APPLICANT: Kuo, Sophia S.
APPLICANT: Napier, Mary A.
APPLICANT: Pan, James;
APPLICANT: Paoni, Nicholas F.
APPLICANT: Roy, Margaret Ann
APPLICANT: Shelton, David L.
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Williams, P. Mickey



APPLICANT: Wood, William I. 

TITLE OF INVENTION: Secreted and Transmembrane Polypeptides and Nucleic
TITLE OF INVENTION: Acids Encoding the Same 
FILE REFERENCE: P2630P1C22 

CURRENT APPLICATION NUMBER: US/09/978,608A
CURRENT FILING DATE: 2001-10-16
NUMBER OF SEQ ID NOS: 624
Prior Application removed - See File Wrapper or Palm 
SEQ ID NO 196 
LENGTH: 518 
TYPE: PRT 

ORGANISM: Homo sapien 
US-09-978-608A-196 

Query Match 100.0%; Score 2687; DB 10; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



RESULT 15 

US-09-978-585A-196 



Sequence 196, Application US/09978585A 
Publication No. US20030049633A1 
GENERAL INFORMATION: 



APPLICANT: Ashkenazi, Avi
APPLICANT: Baker Kevin P.
APPLICANT: Botstein, David
APPLICANT: Desnoyers, Luc
APPLICANT: Eaton, Dan
APPLICANT: Ferrara, Napoleon
APPLICANT: Filvaroff, Ellen
APPLICANT: Fong, Sherman
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerber, Hanspeter
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Grimaldi, J. Christopher
APPLICANT: Gurney, Austin L.
APPLICANT: Hillan, Kenneth J
APPLICANT: Kljavin, Ivar J.
APPLICANT: Kuo, Sophia S.
APPLICANT: Napier, Mary A.
APPLICANT: Pan, James;
APPLICANT: Paoni, Nicholas F.
APPLICANT: Roy, Margaret Ann
APPLICANT: Shelton, David L.
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Williams, P. Mickey
APPLICANT: Wood, William I.



TITLE OF INVENTION: Secreted and Transmembrane Polypeptides and Nucleic
TITLE OF INVENTION: Acids Encoding the Same
FILE REFERENCE: P2630P1C15

CURRENT APPLICATION NUMBER: US/09/978,585A
CURRENT FILING DATE: 2001-10-16
NUMBER OF SEQ ID NOS: 624
Prior Application removed - See File Wrapper or Palm
SEQ ID NO 196
LENGTH: 518
TYPE: PRT
ORGANISM: Homo sapien
US-09-978-585A-196 

Query Match 100.0%; Score 2687; DB 10; Length 518; 

Best Local Similarity 100.0%; Pred. No. 2.8e-240; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518
       |||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518



Search completed: March 4, 2004, 15:57:36 
Job time : 59.8617 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: March 4, 2004, 15:28:35 ; Search time 74.3936 Seconds
(without alignments)
2196.942 Million cell updates/sec

Title: US-09-668-314C-2
Perfect score: 2687
Sequence: 1 MGALARALLLPLLAQWLLRA...RPRDPEWNDESSLVRHRWK 518



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 
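
The search above is described as using a Smith-Waterman ("sw") model with a gap-open value of 10.0 and a gap-extension value of 0.5. As a rough illustration only, the sketch below computes a local alignment score with affine gaps. It assumes that a gap of length k costs gap_open + k * gap_ext (the report does not state the exact convention GenCore uses), and it substitutes a toy match/mismatch function for BLOSUM62, so it will not reproduce the scores printed in this report. The names `sw_score` and `toy_score` are hypothetical.

```python
def toy_score(x, y):
    # Stand-in for a substitution matrix such as BLOSUM62 (assumption).
    return 5.0 if x == y else -4.0


def sw_score(a, b, score, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0.0] * (m + 1)      # best scores for the previous row
    f = [neg_inf] * (m + 1)       # best score ending in a vertical gap, per column
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        e = neg_inf               # best score ending in a horizontal gap
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f[j] = max(f[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, diag, e, f[j])
            best = max(best, h_cur[j])
        h_prev = h_cur
    return best
```

With a real substitution matrix passed in as `score`, the same recurrence would apply; only the per-pair values change.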



Searched: 1017041 seqs, 315518202 residues
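
The throughput figure in this header can be cross-checked from the other numbers it reports. The sketch below assumes that one "cell update" is one comparison of a query residue against a database residue, which is an interpretation of the printout rather than something it states; the function name is illustrative.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    # Cells in the dynamic-programming matrix, divided by search time.
    return query_len * db_residues / seconds / 1e6


# Query length 518, 315518202 database residues, 74.3936 s search time.
rate = million_cell_updates_per_sec(518, 315518202, 74.3936)
print(f"{rate:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the reported 2196.942 Million cell updates/sec to within rounding.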



Total number of hits satisfying chosen parameters: 1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000



Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database: SPTREMBL 25:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 
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No. 



Score 
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ID 
17  386    14.4  244   5  Q8WQY9  Q8wqy9 aphrocallis
18  367.5  13.7  383  13  Q9DEC3  Q9dec3 xenopus lae
19  361.5  13.5  389   6  Q9GMY4  Q9gmy4 sorex ungui
20  355.5  13.2  384  13  Q91322  Q91322 rana catesb
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093428 
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36 
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a n 

397 
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Q800a0 rana catesb 


37 
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n n A 

37 8 


13 


Q9PUR9 


Q9pur9 pseudopleur 


38 
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11 . 


8 


392 


11 


Q9D /R / 


Q9d7r7 mus muscuiu 


39 


313 


11 . 
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383 


5 


076856 


076856 dictyosteli 


40  312.5  11.6  354   5  Q9GYX7  Q9gyx7 boophilus m
41  305    11.4  384  13  Q9DEC2  Q9dec2 xenopus lae
42  302    11.2  398  13  P87370  P87370 oncorhynchu


43 
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11. 


2 


401 


11 


Q91X66 


Q91x66 mus muscuiu 


44 
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11. 
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386 


6 
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45 
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11. 


0 


390 


6 


Q9GK10 


Q9gkl0 camelus dro 



ALIGNMENTS 



RESULT 1 
Q8C5E9 

ID Q8C5E9 PRELIMINARY; PRT; 514 AA. 

AC Q8C5E9; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 



DE Beta-site APP-cleaving enzyme 2. 

GN BACE2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Testis;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

DR EMBL; AK078770; BAC37384.1; 

DR MGD; MGI:1860440; Bace2.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

SQ SEQUENCE 514 AA; 55811 MW; CBB9237BB68A0B2E CRC64; 

Query Match 89.5%; Score 2405; DB 11; Length 514; 

Best Local Similarity 88.8%; Pred. No. 3.5e-179; 

Matches 460; Conservative 20; Mismatches 34; Indels 4; Gaps 1;

Qy 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Db 1 MGALLRALLLPVLAQWLLSAVPALAPAPFTLPLQVARATNHRASAVPGLGTPGLPRADGL 60



Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I II I I I I I I I I 

Db 61 ALALEPVRAT----ANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVAG 116



Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Db 117 APHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATI 176



Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

I I I I I M I I I II I I I II I I ! I I I M I I I II I I M I M I II I I : : I I I I I I II I I I I I 

Db 177 FESENFFLPGIKWNGILGLAYAAIAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVA 236



Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

I I I I I II I I I I I M M I M II I I I I I I II I I I I I I I I II I I I I II I : I I II I II I II II I 
Db 237 GSGTNGGSLVTGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKA 296



Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Db 297 IVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISI 356



Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
I I II II : I I I I I I II I I II I I II I I II I MINIMI! II II II I I I I II I I I I I : II 



Db 357 YLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYWFD 416



Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Mhlllll I M I I I I I I I I M M I II I : I I I I I I I I : I : I I I I I i I I I I M i I I I 
Db 417 RAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCG 476

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518

I M I I I I : I I I I I I : I I I I I I I I I I I II I I I I I I 

Db 477 AILLVLILLLLLPLHCRHAPRDPEWNDESSLVRHRWK 514 
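
The percentages printed for this result are consistent with two simple ratios, which the sketch below reproduces from the figures for RESULT 1. These relationships are inferred from the numbers in this report, not taken from GenCore documentation, and the function names are illustrative: "Query Match" appears to be the hit score divided by the perfect score, and "Best Local Similarity" appears to be the number of matches divided by the number of alignment columns (matches, conservative substitutions, mismatches and indels).

```python
def query_match_pct(score, perfect_score):
    # Hit score as a percentage of the query's self-alignment score.
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    # Identical positions as a percentage of all alignment columns.
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# RESULT 1: Score 2405 against a perfect score of 2687;
# Matches 460, Conservative 20, Mismatches 34, Indels 4.
print(round(query_match_pct(2405, 2687), 1))
print(round(best_local_similarity_pct(460, 20, 34, 4), 1))
```

The same two ratios can be applied to the statistics lines of the other results in this listing.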



RESULT 2 
Q8C793 

ID Q8C793 PRELIMINARY; PRT; 514 AA. 

AC Q8C793;

DT 01-MAR-2003 (TrEMBLrel. 23, Created)

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Beta-site APP-cleaving enzyme 2. 

GN BACE2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Heart;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

DR EMBL; AK052309; BAC34931.1; 

DR MGD; MGI:1860440; Bace2.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

SQ SEQUENCE 514 AA; 55871 MW; 8BF45E07B0990225 CRC64; 

Query Match 89.3%; Score 2399; DB 11; Length 514; 

Best Local Similarity 88.6%; Pred. No. 1e-178;

Matches 459; Conservative 20; Mismatches 35; Indels 4; Gaps 1;

Qy 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

| | || I I I I I I : I I I I I I I I I I I I I M II I : I I I I I : I I M I 'III 
Db 1 MGALLRALLLPVLAQWLLSAVPALAPAPFTLPLQVARATNHRASAVPGLGTPELPRADGL 60

Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

| | | | | | | : I I I I I I I I I I I II I II II I I I I I I I II I I I : I I I I I I I I M I I I I I 

Db 61 ALALEPVRAT----ANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVAG 116



Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

||||||||||:| | I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I M I I I I I I I I 

Db 117 APHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATI 176 

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I :: I I I I I I I I I I I I I 

Db 177 FESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVA 236

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

I | I || I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I 

Db 237 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKA 296 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

I | | || I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I : I I I I I I I I I I I I : I I I I I I I 

Db 297 IVDSGTTLLRLPQKVFDAWEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISI 356 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

I I I I I I : I II I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I II I I I I I I I I I I : I I 
Db 357 YLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYWFD 416

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

| | | : I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I : I : I II I I I II I I I I I I I I 
Db 417 RAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCG 476

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEWNDESSLVRHRWK 518 

I I I I I I I : I I I I I I : I I I I I I I I I I I I I I I I I I I 
Db 477 AILLVLILLLLLPLHCRHAPRDPEWNDESSLVRHRWK 514



RESULT 3 
Q9JL18 

ID Q9JL18 PRELIMINARY; PRT; 514 AA. 

AC Q9JL18; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Aspartyl protease 1. 

GN BACE2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Choi D.K., Sugano S., Sakaki Y.; 

RT "Molecular characterization of the mouse Asp1 gene, a homolog of the

RT human ASP1 (Down Syndrome Region aspartyl protease).";

RL Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AF216310; AAF36599.1; -. 

DR HSSP; P00797; 2REN.

DR MEROPS; A01.041; -. 

DR MGD; MGI:1860440; Bace2.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0008233; F:peptidase activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 514 AA; 55799 MW; A70725F2C1DF5B47 CRC64;

Query Match 89.1%; Score 2395; DB 11; Length 514; 

Best Local Similarity 88.6%; Pred. No. 2.1e-178; 

Matches 459; Conservative 20; Mismatches 35; Indels 4; Gaps 1; 

Qy 1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

I I I I I I I I I : I I I I I I I I I I I I I I I I I I : I I I I I : I I I I I I I I I 

Db 1 MGALLRALLLLVLAQWLLSAVPALAPAPFTLPLQVAGATNHRASAVPGLGTPELPRADGL 60 

Qy 61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

I I I II I I : I I I I I I I I I I I I I I I I II I I I I I I I I I I I I : I I I I I I I I I I I I I I I 

Db 61 ALALEPVRAT----ANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVQILVDTGSSNFAVAG 116

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180 

I I I I I I I I I I : I I I I I I I I I I I J I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I 
Db 117 APHSYIDTYFDSESSSTYHSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATI 176 

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II M :: I I I I I I I I II I I I 
Db 177 FESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVA 236 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I 

Db 237 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKA 296 

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I : I I I I I I I 
Db 297 IVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISI 356 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420 

I I I I I I : I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I : I I 

Db 357 YLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYVVFD 416

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

I I I : I I I I I II I I I I I II I I I I I I I I I I : I I I I I I I I : I : I I I I I I I I I I I I I I I I 
Db 417 RAQRRVGFAVSPCAEIEGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCG 476

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518

1111111:11111 I : I I II I I I I I I I I I I M I I I 
Db 477 AILLVLILLLLLPLHCRHAPRDPEVVNDESSLVRHRWK 514



RESULT 4 
Q9NZL2 

ID Q9NZL2 PRELIMINARY; PRT; 468 AA.

AC Q9NZL2;

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Aspartyl protease.

GN BACE2.

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20422477; PubMed=10965118;

RA Solans A., Estivill X., de La Luna S.;

RT "A new aspartyl protease on 21q22.3, BACE2, is highly similar to

RT Alzheimer's amyloid precursor protein beta-secretase.";

RL Cytogenet. Cell Genet. 89:177-184(2000).

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AF188276; AAF35835.1; -.

DR HSSP; P00797; 2REN.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0008233; F:peptidase activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 468 AA; 50324 MW; 717E0920126A0142 CRC64;

Query Match 88.4%; Score 2375; DB 4; Length 468; 

Best Local Similarity 90.3%; Pred. No. 6.6e-177; 

Matches 468; Conservative 0; Mismatches 0; Indels 50; Gaps 1; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASL-------------------------------- 328

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
                         ||||||||||||||||||||||||||||||||||||||||||
Db 329 ------------------LYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 370

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 371 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 430

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 431 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 468



RESULT 5 
Q9H2V8 

ID Q9H2V8 PRELIMINARY; PRT; 439 AA. 

AC Q9H2V8; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE CDA13. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Pheochromocytoma; 

RA Li Y., Huang Q., Peng Y., Song H., Yu Y., Xu S., Ren S., Chen Z.,

RA Han Z.;

RL Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases.

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AF212252; AAG41783.1; -.

DR HSSP; P00797; 2REN.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0008233; F:peptidase activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 439 AA; 48275 MW; 02EC0E0E50F11602 CRC64; 

Query Match 85.3%; Score 2293; DB 4; Length 439; 

Best Local Similarity 100.0%; Pred. No. 1.5e-170; 

Matches 439; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy  80 MVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYR 139
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYR 60

Qy 140 SKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPGIKWNGILGL 199
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPGIKWNGILGL 120

Qy 200 AYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLY 259
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLY 180

Qy 260 KGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAV 319
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 KGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAV 240

Qy 320 VEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQL 379
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQL 300

Qy 380 YIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGA 439
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 YIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGA 360

Qy 440 AVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQRR 499
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 AVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQRR 420

Qy 500 PRDPEVVNDESSLVRHRWK 518
       |||||||||||||||||||
Db 421 PRDPEVVNDESSLVRHRWK 439



RESULT 6 
Q8N2D4 

ID Q8N2D4 PRELIMINARY; PRT; 423 AA. 

AC Q8N2D4; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created)

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Hypothetical protein OVARC1000363.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A.

RC TISSUE=Ovarian carcinoma;

RA Ota T., Nishikawa T., Suzuki Y., Kawai-Hio Y., Hayashi K., Ishii S.,

RA Saito K., Yamamoto J., Wakamatsu A., Nagai T., Nakamura Y.,

RA Nagahari K., Sugano S., Isogai T.;

RT "HRI human cDNA sequencing project.";

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AK075539; BAC11682.1; -.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hypothetical protein. 

SQ SEQUENCE 423 AA; 46457 MW; 4D4839F2ED9C2CE1 CRC64; 



Query Match 81.3%; Score 2184; DB 4; Length 423;

Best Local Similarity 99.3%; Pred. No. 4.7e-162;

Matches 420; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy  96 MLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWT 155
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWT 60

Qy 156 GFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFF 215
       |||||||||||||||||||||||||||| |||||||:|||||||||||||||||||||||
Db  61 GFVGEDLVTIPKGFNTSFLVNIATIFESGNFFLPGIQWNGILGLAYATLAKPSSSLETFF 120

Qy 216 DSLVTQANIPNVFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQ 275
       ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db 121 DSLVTQANIPNVFSMQMRGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQ 180

Qy 276 IEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDG 335
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 IEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDG 240

Qy 336 FWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNYECYR 395
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 FWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNYECYR 300

Qy 396 FGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVAS 455
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 FGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVAS 360

Qy 456 NCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRH 515
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 NCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRH 420

Qy 516 RWK 518
       |||
Db 421 RWK 423



RESULT 7 
Q9NZL1 

ID Q9NZL1 PRELIMINARY; PRT; 396 AA. 

AC Q9NZL1; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Aspartyl protease. 

GN BACE2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20422477; PubMed=10965118;

RA Solans A., Estivill X., de La Luna S.;

RT "A new aspartyl protease on 21q22.3, BACE2, is highly similar to

RT Alzheimer's amyloid precursor protein beta-secretase.";

RL Cytogenet. Cell Genet. 89:177-184(2000).

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AF188277; AAF35836.1; -.

DR HSSP; P00797; 2REN.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0008233; F:peptidase activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Aspartyl protease; Hydrolase; Protease. 

SQ SEQUENCE 396 AA; 43013 MW; 5023A7AF391CEAC9 CRC64; 



Query Match 73.2%; Score 1966; DB 4; Length 396; 

Best Local Similarity 100.0%; Pred. No. 4.5e-145; 

Matches 378; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQ 378
       ||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQ 378



RESULT 8
Q7T0Y2

ID Q7T0Y2 PRELIMINARY; PRT; 500 AA.

AC Q7T0Y2; 

DT 01-OCT-2003 (TrEMBLrel. 25, Created) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 
DE Hypothetical protein. 

OS Xenopus laevis (African clawed frog).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae;

OC Xenopodinae; Xenopus.

OX NCBI_TaxID=8355; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Embryo; 

RX MEDLINE=22341132; PubMed=12454917;

RA Klein S.L., Strausberg R.L., Wagner L., Pontius J., Clifton S.W.,

RA Richardson P.;

RT "Genetic and genomic tools for Xenopus research: The NIH Xenopus

RT initiative.";

RL Dev. Dyn. 225:384-391(2002). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Embryo;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Embryo; 

RA Klein S., Strausberg R.;

RL Submitted (AUG-2003) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC055989; AAH55989.1; -.

KW Hypothetical protein. 

SQ SEQUENCE 500 AA; 54722 MW; 10F16756CAFDCD0B CRC64; 

Query Match 63.0%; Score 1693; DB 13; Length 500; 

Best Local Similarity 63.6%; Pred. No. 1.4e-123; 

Matches 322; Conservative 74; Mismatches 100; Indels 10; Gaps 4; 

Qy 13 LAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGLALALEPALASPA 72

I : I I I I I : I I I : I : I III:: I I I I : I 

Db 5 L VRL L L L C AAAC AS N K F I VP LN VS P AE I KGT L P V- AP AT P K D K — PGLLLA SDPG 56 

Qy 73 GAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDT 132

I || : || I I I I I I I I I I I I I : I I I : I I I I : I I I I I I I I I I I I I I : I : : : I : I I : 

Db 57 GTINFFSMVDNLAGDSGRGYYLELLIGSPPQKVNILVDTGSSNFAVAGSPNPDVNTFFDS 116 



Qy 133 ERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPGIK 192

: I : : I : I : I I I : I I I I I I I I : I : I : I : I I I I I : I I : I I I : I I : I I : I I I I I 
Db 117 KLSTSYQSLNTEVTVRYTQGSWTGLLGKDVVSIPKGVNGTFLINIASIFQSESFFLPNIN 176

Qy 193 WNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVAGSGTNGGSLVLG 252 

I I I I I I I I : I I I I I I I I : I I I I I I I I I I I : I I I I I I I I I I I : I I I I I I I I 

Db 177 WQGILGLAYSTLAKPSSSVEPFFDSLVQQENIPDVFSMQMCGAGQSSPGNGINAGSLVLG 236 

Qy 253 GIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLP 312

I : I I I I I I I : I I I I I I I I I I I I : I : I I hill I I I I I I I : I I I I I I I I I I I I I I I 

Db 237 GVEPSLYKGNIWYTPITEEWYYQVEVLKFEVGGQRLNLDCTVYNSDKAIVDSGTTLLRLP 296 

Qy 313 QKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFR 372

I I I : I : I : I : : III I : II I I I I I I :: I I : I II I I I I I I I I : I I I I I 

Db 297 DKVFNAVVDAIVQTSLIQNFNAEFWAGLQLACWDKTQQPWNYFPDISIYLRDTNTSRSFR 356

Qy 373 ITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASP 432 

: I : I I I I I I : : : I : I I I I I I : I I I I I II I I I I I I I I I I I : I I I I I I I 

Db 357 LTLKPQLYIQSVLTFQESLNCFRFGISQSASTLVIGATVMEGFYVIFDRAEKRVGFAVSS 416 

Qy 433 CAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLL 492 

I I I : : I lllhlll I I I : I I I : I I I I : I I : I I : I I I : II I I I I I : : I I I I 

Db 417 CAEVSGITVSEIAGPFGTSDVSSNCIARNPLREPIMWIISYSLMSLCGMILLVLVILLLL 476 

Qy 493 PFRCQRRPRDPEVVNDESSLVRHRWK 518

I : I I I : I I I I I I I : I I I I 

Db 477 SNR--QRHDDMETINDESSLVQHRWK 500



RESULT 9 
Q9R1P7 

ID Q9R1P7 PRELIMINARY; PRT; 255 AA.

AC Q9R1P7; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Aspartyl protease (Fragment).

GN BACE2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Accarino M., Fumagalli P., Taramelli R., Ottolenghi S.;

RT "Cloning of a gene from chromosome 21 Down Region encoding a potential

RT transmembrane protease.";

RL Submitted (FEB-1998) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF051150; AAD45964.1; -.

DR MEROPS; A01.041; -.

DR MGD; MGI:1860440; Bace2.

DR GO; GO:0004190; F:aspartic-type endopeptidase activity; IEA.

DR GO; GO:0008233; F:peptidase activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR PROSITE; PS00141; ASP_PROTEASE; 1.



KW Protease. 

FT NON_TER 1 1 

SQ SEQUENCE 255 AA; 28685 MW; 53DE317815996D63 CRC64; 

Query Match 46.4%; Score 1246; DB 11; Length 255; 

Best Local Similarity 91.0%; Pred. No. 4e-89; 

Matches 232; Conservative 12; Mismatches 11; Indels 0; Gaps 0; 

Qy 264 WYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAV 323 

I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 1 WYTPIKEEWYYQIEILKLEIGGQNLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAV 60

Qy 324 ARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQP 383 

II I I I I I I I I I I I I I : I I I I I I I I I I I I : I I I I I I I I I I I II : I I I I I I I I I I I I I I I I 

Db 61 ARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISIYLRDENASRSFRITILPQLYIQP 120 

Qy 384 MMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGAAVSE 443

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I : I I I I I : I I I I I I I I I I I I III 
Db 121 MMGAGFNYECYRFGISSSTNALVIGATVMEGFYVVFDRAQRRVGFAVSPCAEIEGTTVSE 180

Qy 444 ISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQRRPRDP 503 

I I I I M I I I : I I II I I I I : I : I I I I I I I I I I I I I I I I I I I I I I I : I I I : I I : I I I I 

Db 181 ISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCGAILLVLILLLLVPLHCRHAPRDP 240

Qy 504 EVVNDESSLVRHRWK 518

I I I I I I I I I I I I I I I 
Db 241 EVVNDESSLVRHRWK 255



RESULT 10 
Q8C7R1 

ID Q8C7R1 PRELIMINARY; PRT; 501 AA. 

AC Q8C7R1; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Beta-site APP cleaving enzyme. 

GN BACE.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Spinal cord; 

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium,

RA the RIKEN Genome Exploration Research Group Phase I & II Team;

RT "Analysis of the mouse transcriptome based on functional annotation of

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK049626; BAC33844.1; -.

DR MGD; MGI:1346542; Bace.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.



DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

SQ SEQUENCE 501 AA; 55761 MW; B410DA8B64647663 CRC64; 

Query Match 44.1%; Score 1186; DB 11; Length 501; 

Best Local Similarity 46.2%; Pred. No. 5.4e-84; 

Matches 238; Conservative 82; Mismatches 169; Indels 26; Gaps 7; 

Qy 9 L L P L LAQWL L RAAP E LAP AP FT L P LRVAAATN RWAP T P G P GT P AE RHAD GLALA 63 

: I I II : I I I I I I I I II II : 
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDEES 51 

Qy 64 L E P ALAS P AGAAN FLAMVDN LQ GD S GRG Y Y L EML I GT P P Q KLQ I L VDT G S S N FAVAGT P H 123 

I : I : I I I I I : I I I : I I I : I I I I : I I I I I I I I I I I I II I I II 

Db 52 EEPGRRGSFVEMVDNLRGKSGQGYYVEMTIGSPPQTLNILVDTGSSNFAVGAAPH 106 

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183 

: : I : : II I I I I I I I I I I I : I I I I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVWPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166 

Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV---A 240

: ||: I I I I I I I I I I : I : I III II I I I I I : I I I : I I : I : I I I I I : 
Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226 

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

: I I I : : : I I I : III I : I I I I I : INI:: I : : : I I II I : I I : I I I I I : 

Db 227 ALASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKS 286 

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

I I I I I I I I I I I : I I I : I I :: : II : I I I I I I I I I I II I : I I I I : 

Db 287 IVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISL 346

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIF 419

II I : : : I I I I I I I I I I : : I : : : I I : I : I I : I : I I : I I I I I I.: I 

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVF 406

Qy 420 DRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVC 479

I I I : I I : II I I I : : | | | | | : I I : : I : : : I 

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466 

Qy 480 GAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 512

I : : : : I : : : I I I I : : : I I I 

Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500 



RESULT 11 
Q9ULS1 

ID Q9ULS1 PRELIMINARY; PRT; 532 AA. 

AC Q9ULS1; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-OCT-2001 (TrEMBLrel. 18, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Hypothetical protein KIAA1149 (Fragment).

GN KIAA1149.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RX MEDLINE=20039618; PubMed=10574461;

RA Hirosawa M., Nagase T., Ishikawa K., Kikuno R., Nomura N., Ohara O.;

RT "Characterization of cDNA clones selected by the GeneMark analysis

RT from size-fractionated cDNA libraries from human brain.";

RL DNA Res. 6:329-336(1999).

CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY A1.

DR EMBL; AB032975; BAA86463.2; -.

DR HSSP; P56272; 1AM5.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0008233; F:peptidase activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid. 

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

KW Hypothetical protein; Aspartyl protease; Hydrolase; Protease. 

FT NON_TER 1 1

SQ SEQUENCE 532 AA; 58720 MW; 98B135D0D5FBD2E8 CRC64; 

Query Match 44.0%; Score 1183.5; DB 4; Length 532; 

Best Local Similarity 47.8%; Pred. No. 9.3e-84; 

Matches 231; Conservative 81; Mismatches 154; Indels 17; Gaps 6; 

Qy 44 APTPGPGTPAERHADGLALAL EPALAS PAGAANFLAMVDNLQGDSGRGYYLE 95 

II: II I II II I : I : I I I I I : I I I : I I I : I 

Db 52 APSTASGCPCAAAWGGAPLGLRLP RET DEEP — EEPGRRGSFVEMVDNLRGKSGQGYYVE 109 

Qy 96 MLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWT 155 

I : I : I I I I I I I I I I I I I I I I II:: I : • I I I I I I I I M I I 

Db 110 MTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWE 169 

Qy 156 GFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFF 215 

I : I I I I : II I I : I I I I I I : I I : I I I I I I I I I I : I : I I I I I I 

Db 170 GELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFF 229

Qy 216 DSLVTQANI PNVFSMQMCGAGLPVAGS GTNGGS LVLGGI EP SLYKGDIWYT P I KEEW 272 

I II I I : : I I : I I : I : I I I I I : I : M I : : : I I I : III I : I I I I I : II 

Db 230 DSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREW 289

Qy 273 YYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEF 332

II:: I : : : I I II I : I I : I I I I I : I I I I I I I I I I I : I I I : I I : : : II : I 
Db 290 YYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKF 349

Qy 333 SDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY- 391 

I I I I I I I I I I I I : I I I I : I I I : : : I I I I I I I I I I : : I : : 

Db 350 PDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQD 409 

Qy 392 ECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTE 451

: I I : I I I I : I : I I : I I I I II : I I I I : I I : I I I I I : : II I I 



Db 410 DCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTL 469



Qy 452 DVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLIVLLLLPFRCQR--RPRDPEVVNDE 509

I. J I ••[•••||» * I I I I* " * I 

* I I •••(!• •••!»» * I I I I* * * I 

Db 470 DMEDCGYNIPQTDESTLMTIAYVMAAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDI 528 

Qy 510 SSL 512 

I I 

Db 529 SLL 531 



RESULT 12 
Q8BQY4 

ID Q8BQY4 PRELIMINARY; PRT; 501 AA. 

AC Q8BQY4; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Beta-site APP cleaving enzyme. 

GN BACE.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium,

RA the RIKEN Genome Exploration Research Group Phase I & II Team;

RT "Analysis of the mouse transcriptome based on functional annotation of

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK046175; BAC32620.1; -.

DR MGD; MGI:1346542; Bace.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

SQ SEQUENCE 501 AA; 55816 MW; C0855513145E024E CRC64; 

Query Match 44.0%; Score 1183; DB 11; Length 501; 

Best Local Similarity 46.0%; Pred. No. 9.2e-84; 

Matches 237; Conservative 83; Mismatches 169; Indels 26; Gaps 7; 

Qy 9 LLPLLAQWLLRAAPELAPAPFT L P L RVAAATN RWAP T P G P GT P AE RHADGLALA 63 

: I I II : I I I I I I I I II II : 
Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDEES 51 

Qy 64 LEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPH 123

I : I : I I I I I : I I I : I M : I I : I : I I I I I I I I I I I I I I II II 

Db 52 EEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPH 106 



Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183 

: : I : : I I I I I I I I I I I I I : I I I I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166 

Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV---A 240

: II: I I I I I I I I I I : I : I III I I I I I I I : I I I : I I : I : I I I I I : 
Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

: I I I : : : I I I : I I I I : I I I I I : I I I I : : I : : : I I II I : I I : I I I I I : 
Db 227 ALASVGGSMIIGGIDHSLYTGRLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKS 286 

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

I I I I I I I I II I : I I I : I I : : : II : I I I I I I I I I I I I I : I I I I : 

Db 287 IVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISL 346 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIF 419

II I : : : I I I I I I I I I I : : I : : : I I : I : I I : I : I I : I I I I I I : I 

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVF 406

Qy 420 DRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVC 479

I I I : I I : I I I I I : : I I I I I : I I : : I : : : I 

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466

Qy 480 GAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 512

I : : : : I : : : I I I I : : : I I I 
Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500 



RESULT 13 
Q8IYC8 

ID Q8IYC8 PRELIMINARY; PRT; 501 AA.

AC Q8IYC8; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Beta-site APP-cleaving enzyme. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RA Strausberg R.;

RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC036084; AAH36084.1; 

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

SQ SEQUENCE 501 AA; 55824 MW; 768595CF5517EFB7 CRC64;



Query Match 43.6%; Score 1172.5; DB 4; Length 501; 

Best Local Similarity 46.1%; Pred. No. 6.1e-83; 

Matches 239; Conservative 82; Mismatches 165; Indels 33; Gaps 9;



Qy 7 ALLLPLLAQWLLRAAPELAPAPFT LPLRVAAATNRWAPTPGPGTPAERHADGLA 61 

I I I I I : : I I I I I I I II II 
Db 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

Qy 62 LALE--PALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVA 119

I I I : I : I I I I I : I I I : I I I : I I : I : I I I I I I II I I I I I I I I 

Db 43 LPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102

Qy 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179

II:: I : : I I I I I I I I I I I I : I I I I : I I I I : III 

Db 103 AAPHPFLHRYYQRQLFSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162 

Qy 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

I I I : I I : I I I I I I I I I I : I : I I I I I I I I I I I : : I I : I I : I : I I I I I : 
Db 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPL 222 

Qy 240 AGS GTNGGSLVLGGI EPSLYKGDIWYTPI KEEWYYQI EI LKLEI GGQS LNLDCREYN 296 

I : I I I : : : I I I : I I I I : II I I I : III:: I : : : I I II I : I I : I I I 

Db 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 

Qy 297 ADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356

I I : I I I I I I I I I I I : I I I : I I :: : II : I II I I I I I I I I I I : I I 

Db 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342 

Qy 357 KISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGF 415 

I I : I I I : : : I I I I M I I I I : : I : : : I I : I I I I : I : I I : I I I I 

Db 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402 

Qy 416 YVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYAL 475

I I : I I I I : I I : I I I I I : : I I I I I : I I : : I : 

Db 403 YVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462

Qy 476 MSVCGAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 512

: : I I : : : : I : : Mil I : : : I I I 
Db 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 



RESULT 14 
Q8C4F4 

ID Q8C4F4 PRELIMINARY; PRT; 467 AA. 

AC Q8C4F4; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Beta-site APP cleaving enzyme. 

GN BACE . 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Cerebellum;



RX MEDLINE=22354683; PubMed=12466851; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

DR EMBL; AK082317; BAC38462.1; 

DR MGD; MGI:1346542; Bace.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 1. 

SQ SEQUENCE 467 AA; 52063 MW; 31AB674FF1843652 CRC64; 



Query Match 39.0%; Score 1049; DB 11; Length 467; 

Best Local Similarity 41.7%; Pred. No. 2.4e-73; 

Matches 215; Conservative 77; Mismatches 163; Indels 60; Gaps 8; 



Qy 9 LLP LLAQWLLRAAPELAPAP FT LPLRVAAATNRWAPTPGPGTPAERHADGLALA 63 

: I I II : I I I I II I I II II : 

Db 1 MAPALHWLLLWVGSGMLPAQGTHLGI RLPLRSGLA GPPLGLRLPRETDEES 51 



Qy 64 LEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPH 123 

I : I : I I I I I : I I I : I M : I I : I : I I I I I I I I I I I I I I I I I I 

Db 52 EEPGRRGS FVEMVDNLRGKSGQGYYVEMTVGS PPQTLNI LVDTGS SNFAVGAAPH 106 

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183 

:: I : : I I I I I I I I I I I I I : I I I I : I I I I : I I I I I I 

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166 



Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV A 240 

: ||: I I I I I I I I I I : I : I III I I I I I I I : I I I : I I : I : I I I I I : 
Db 167 DKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226



Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300 

: I I I : : : I I I : III I : I I I I I : MM:: | : : : | | II I : I I : I 

Db 227 ALAS VGGSMI I GGI DHS L YTGS LW YT P I RREWYYEVI I VRVEINGQDLKMDCKE 280 

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360 

: I I I I I I II I I 111:1111: 
Db 281 TEKFPDGFWLGEQLVCWQAGTTPWNIFPVTSL 312 

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIF 419

II | : : : | | | | | | I I I I : : I : : : I I : I : I I : I : I I : I I I I I I : I 

Db 313 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVF 372

Qy 420 DRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVC 479 

I I I : I I : I I I I I : : | | | | | : I I : : I : : : I 

Db 373 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 432



Qy 480 GAILLVLIVLLLLPFRCQR--RPRDPEVVNDESSL 512

I : : : : I : : : I I I I : : : I I I
Db 433 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 466



RESULT 15
Q9CUU5

ID Q9CUU5 PRELIMINARY; PRT; 267 AA.

AC Q9CUU5; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Adult male brain cDNA, RIKEN full-length enriched library, 

DE clone:3526402A15 product:beta-site APP cleaving enzyme, full insert

DE sequence (Fragment) . 

GN BACE . 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RA Adachi J., Aizawa K., Akahira S., Akimura T., Arai A., Aono H.,

RA Arakawa T., Bono H., Carninci P., Fukuda S., Fukunishi Y., Furuno M

RA Hanagaki T., Hara A., Hayatsu N., Hiramoto K., Hiraoka T., Hori F.,

RA Imotani K., Ishii Y., Itoh M., Izawa M., Kasukawa T., Kato H.,

RA Kawai J., Kojima Y., Konno H., Kouda M., Koya S., Kurihara C.,

RA Matsuyama T., Miyazaki A., Nishi K., Nomura K., Numazaki R., Ohno M

RA Okazaki Y., Okido T., Owa C., Saito H., Saito R., Sakai C., Sakai K

RA Sano H., Sasaki D., Shibata K., Shibata Y., Shinagawa A., Shiraki T

RA Sogabe Y., Suzuki H., Tagami M., Tagawa A., Takahashi F., Tanaka T.

RA Tejima Y., Toya T., Yamamura T., Yasunishi A., Yoshida K., Yoshino ]

RA Muramatsu M., Hayashizaki Y.;

RL Submitted (JUL-2000) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=21085660; PubMed=11217851;

RA RIKEN FANTOM Consortium; 

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=99279253; PubMed=10349636; 

RA Carninci P., Hayashizaki Y. ; 

RT "High-efficiency full-length cDNA cloning."; 

RL Meth. Enzymol. 303:19-44(1999). 

RN [5] 

RP SEQUENCE FROM N.A. 



RC STRAIN=C57BL/6J; TISSUE=Brain;

RX MEDLINE=20499374; PubMed=11042159;

RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M.,

RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.;

RT "Normalization and subtraction of cap-trapper-selected cDNAs to

RT prepare full-length cDNA libraries for rapid discovery of new genes.";

RL Genome Res. 10:1617-1630(2000).

RN [6]

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6J; TISSUE=Brain;

RX MEDLINE=20530913; PubMed=11076861;

RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P.,

RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M.,

RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A.,

RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K.,

RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M.,

RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J.,

RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.;

RT "RIKEN integrated sequence analysis (RISA) system-384-format

RT sequencing pipeline with 384 multicapillary sequencer.";

RL Genome Res. 10:1757-1771(2000). 

DR EMBL; AK014390; BAB29317.2; -. 

DR MGD; MGI:1346542; Bace.

DR GO; GO:0004194; F:pepsin A activity; IEA.

DR GO; GO:0006508; P:proteolysis and peptidolysis; IEA.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR009007; Pept_A_acid.

DR Pfam; PF00026; asp; 1. 

FT NON_TER 1 1 

SQ SEQUENCE 267 AA; 30333 MW; 9413EB4530AB63B0 CRC64;



Query Match 24.3%; Score 653; DB 11; Length 267; 

Best Local Similarity 45.3%; Pred. No. 8.8e-43; 

Matches 121; Conservative 56; Mismatches 86; Indels 4; Gaps 3;



Qy 249 LVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTL 308
: : : I I I : III I : I I I I I : MM:: | : : : | | | | | : | | : | | I I I : I I I I I I I
Db 1 MIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTN 60

Qy 309 LRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSS 368
I I I I : I I I : I I : : : II : I I I I I I II II I I : I I I I : I I I
Db 61 LRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTN 120

Qy 369 RSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVG 427
I I I I I I I I I I : : I : : I I : I : I I : I : I I : I I I I I I : I I II : I I : I
Db 121 QSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVFDRARKRIG 180

Qy 428 FAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCGAILLVLI 487
I I I I I I I I
Db 181 FAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC-ALFMLPL 239

Qy 488 VLLLLPFRCQR--RPRDPEVVNDESSL 512
I : : : I I I I : : : I I I
Db 240 CLMVCQWRCLRCLRHQHDDFADDISLL 266



Search completed: March 4, 2004, 15:38:52 
Job time : 76.3936 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: March 4, 2004, 15:22:30 ; Search time 16.5319 Seconds 

(without alignments) 
1631.532 Million cell updates/sec 



Title: US-09-668-314C-2 
Perfect score: 2687 

Sequence: 1 MGALARALLLPLLAQWLLRA RPRDPEVVNDESSLVRHRWK 518

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 141681 seqs, 52070155 residues 

Total number of hits satisfying chosen parameters: 141681 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : SwissProt_42 : * 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
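
Note on the search model: the header above names a Smith-Waterman ("sw") model with gap penalties Gapop 10.0 and Gapext 0.5. The Python sketch below illustrates how a local alignment score can be computed under affine gap penalties (Gotoh recurrences). It is not GenCore's implementation. The function names, the toy match/mismatch scorer used in place of BLOSUM62, and the convention that a gap of length k costs gap_open + k * gap_ext are all assumptions made for illustration.

```python
def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix such as BLOSUM62."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a, b, score=toy_score, gap_open=10.0, gap_ext=0.5):
    """Return the best local alignment score of a and b with affine gaps.

    Assumed gap convention: a gap of length k costs gap_open + k * gap_ext.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0 (local).
    # E: best score ending with a gap in a (b[j-1] is unpaired).
    # F: best score ending with a gap in b (a[i-1] is unpaired).
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new gap from H, or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open - gap_ext, E[i][j - 1] - gap_ext)
            F[i][j] = max(H[i - 1][j] - gap_open - gap_ext, F[i - 1][j] - gap_ext)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

With a real substitution table passed as `score`, the same recurrences apply; only the per-residue scores change. Whether the report's "Score" column uses exactly this gap convention is not stated in the report.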



SUMMARIES 

                  %
Result          Query
No.     Score   Match  Length  DB  ID          Description
  1      2687   100.0     518   1  BAE2_HUMAN  Q9y5z0 homo sapien
  2      1187    44.2     501   1  BACE_RAT    P56819 rattus norv
  3      1185    44.1     501   1  BACE_MOUSE  P56818 mus musculu
  4    1178.5    43.9     501   1  BACE_HUMAN  P56817 homo sapien
  5     363.5    13.5     377   1  PEPC_MACFU  P03955 macaca fusc
  6       353    13.1     388   1  PEPC_HUMAN  P20142 homo sapien
  7     351.5    13.1     388   1  PEPC_CALJA  Q9n2d3 callithrix
  8     324.5    12.1     394   1  PEPC_CAVPO  Q64411 cavia porce
  9       320    11.9     402   1  REN1_MOUSE  P06281 mus musculu
 10     315.5    11.7     396   1  CATD_CLUHA  Q9dex3 clupea hare
 11     313.5    11.7     509   1  APR1_ORYSA  Q42456 oryza sativ
 12       313    11.6     392   1  PEPC_RAT    P04073 rattus norv
 13       310    11.5     383   1  PEPE_CHICK  P16476 gallus gall
 14     308.5    11.5     412   1  CATD_HUMAN  P07339 homo sapien
 15     306.5    11.4     410   1  CATD_MOUSE  P18242 mus musculu
 16     305.5    11.4     401   1  RENS_MOUSE  P00796 mus musculu
 17       305    11.4     407   1  CATD_RAT    P24268 rattus norv
 18       302    11.2     324   1  PEP1_GADMO  P56272 gadus morhu
 19       302    11.2     405   1  CARP_YEAST  P07267 saccharomyc
 20     301.5    11.2     398   1  CATE_RAT    P16228 rattus norv
 21     300.5    11.2     387   1  PEP2_RABIT  P27821 oryctolagus
 22     300.5    11.2     397   1  CATE_MOUSE  P70269 mus musculu
 23       299    11.1     398   1  CATD_CHICK  Q05744 gallus gall
 24     298.5    11.1     387   1  PEP4_RABIT  P28713 oryctolagus
 25     298.5    11.1     400   1  RENI_SHEEP  P52115 ovis aries
 26       297    11.1     388   1  PEPA_HUMAN  P00790 homo sapien
 27     294.5    11.0     388   1  PEP2_MACFU  P27677 macaca fusc
 28       291    10.8     388   1  PEP4_MACFU  P27678 macaca fusc
 29       291    10.8     402   1  RENI_RAT    P08424 rattus norv
 30       291    10.8     406   1  RENI_HUMAN  P00797 homo sapien
 31       291    10.8     406   1  RENI_PANTR  P60016 pan troglod
 32     290.5    10.8     396   1  CATE_RABIT  P43159 oryctolagus
 33       289    10.8     387   1  PEP3_RABIT  P27822 oryctolagus
 34       289    10.8     388   1  PAG_HORSE   Q28389 equus cabal
 35     288.5    10.7     390   1  CATD_BOVIN  P80209 bos taurus
 36       288    10.7     387   1  PEP1_RABIT  P28712 oryctolagus
 37       288    10.7     388   1  PEP1_MACFU  P03954 macaca fusc
 38       287    10.7     367   1  PEPA_CHICK  P00793 gallus gall
 39       287    10.7     391   1  CATE_CAVPO  P25796 cavia porce
 40       287    10.7     396   1  CATE_HUMAN  P14091 homo sapien
 41       286    10.6     388   1  PEPA_MACMU  P11489 macaca mula
 42     285.5    10.6     387   1  PEPA_CALJA  Q9n2d4 callithrix
 43       285    10.6     396   1  CARP_NEUCR  Q01294 neurospora
 44     284.5    10.6     386   1  PEPA_PIG    P00791 sus scrofa
 45       283    10.5     388   1  PEPF_RABIT  P27823 oryctolagus
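
Note on the "% Query Match" column: the values in the summary table are consistent with the hit score divided by the perfect score (2687) reported in the header, expressed as a percentage to one decimal place. This relation is inferred from the numbers in the report rather than taken from GenCore documentation, and the function name below is hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Percentage of the perfect (self-match) score reached by a hit.

    Assumption: the report rounds to one decimal place.
    """
    return round(100.0 * score / perfect_score, 1)


# Scores taken from summary rows 1, 2 and 5 of this search.
for hit_score in (2687, 1187, 363.5):
    print(hit_score, query_match_percent(hit_score, 2687))
```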



ALIGNMENTS 



RESULT 1 
BAE2_HUMAN 

ID BAE2_HUMAN STANDARD; PRT; 518 AA. 

AC Q9Y5Z0; Q9UJT6; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Beta secretase 2 precursor (EC 3.4.23.45) (Beta-site APP-cleaving 

DE enzyme 2) (Aspartyl protease 1) (Asp 1) (ASP1) (Membrane-associated 

DE aspartic protease 1) (Memapsin-1) (Down region aspartic protease) . 

GN BACE2 OR ASP21. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20057170; PubMed=10591213;

RA Yan R., Bienkowski M.J., Shuck M.E., Miao H., Tory M.C., Pauley A.M.,

RA Brashier J.R., Stratman N.C., Mathews W.R., Buhl A.E., Carter D.B.,

RA Tomasselli A.G., Parodi L.A., Heinrikson R.L., Gurney M.E.; 

RT "Membrane-anchored aspartyl protease with Alzheimer's disease 

RT beta-secretase activity."; 

RL Nature 402:533-537(1999). 



RN [2] 

RP SEQUENCE FROM N.A.

RC TISSUE=Bone marrow;

RA Xin H., Stephans J.C., Duan X., Harrowe G., Kim E., Grieshammer U.,

RA Giese K.;

RT "Identification of a novel aspartic-like protease differentially 

RT expressed in human breast cancer cell lines."; 

RL Submitted (JAN-1999) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RA Accarino M.P., Fumagalli P., Ottolenghi S., Taramelli R.;

RT "Cloning of a gene from chromosome 21 Down region encoding a potential

RT transmembrane aspartyl protease.";

RL Submitted (FEB-1998) to the EMBL/GenBank/DDBJ databases.

RN [4] 

RP SEQUENCE FROM N.A. 

RA Solans A., Estivill X., de la Luna S.; 

RT "Cloning of a novel mammalian aspartyl protease."; 

RL Submitted (AUG-1999) to the EMBL/GenBank/DDBJ databases.

RN [5] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20120043; PubMed=10656250;

RA Hussain I., Powell D.J., Howlett D.R., Tew D.G., Meek T.D.,

RA Chapman C., Gloger I.S., Murphy K.E., Southan C.D., Ryan D.M.,

RA Smith T.S., Simmons D.L., Walsh F.S., Dingwall C., Christie G.;

RT "Identification of a novel aspartic proteinase (Asp 2) as

RT beta-secretase.";

RL Mol. Cell. Neurosci. 14:419-427(1999). 

RN [6] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20144060; PubMed=10677483;

RA Lin X., Koelsch G., Wu S., Downs D., Dashti A., Tang J.;

RT "Human aspartic protease memapsin 2 cleaves the beta-secretase site of 

RT beta-amyloid precursor protein."; 

RL Proc. Natl. Acad. Sci. U.S.A. 97:1456-1460(2000). 

RN [7] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20289799; PubMed=10830953;

RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,

RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Groner Y.,

RA Soeda E., Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K.,

RA Polley A., Menzel U., Delabar J., Kumpf K., Lehmann R., Patterson

RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,

RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,

RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,

RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,

RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker H.,

RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,

RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,

RA Lehrach H., Reinhardt R., Yaspo M.-L.; 

RT "The DNA sequence of human chromosome 21."; 

RL Nature 405:311-319(2000). 

RN [8] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Skin;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 



RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.; 

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [9] 

RP CHARACTERIZATION.

RX MEDLINE=22088158; PubMed=12093293;

RA Turner R.T. III, Loy J.A., Nguyen C., Devasamudram T., Ghosh A.K.,

RA Koelsch G., Tang J.;

RT "Specificity of memapsin 1 and its implications on the design of 

RT memapsin 2 (beta-secretase) inhibitor selectivity."; 

RL Biochemistry 41:8742-8746(2002). 

CC -!- CATALYTIC ACTIVITY: Broad endopeptidase specificity. Cleaves Glu- 

CC Val-Asn-Leu- | -Asp-Ala-Glu-Phe in the Swedish variant of 

CC Alzheimer's amyloid precursor protein. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF200342; AAF17078.1; -. 

DR EMBL; AF117892; AAD45240.1; -. 

DR EMBL; AF050171; AAD45963.1; -. 

DR EMBL; AF178532; AAF29494.1; 

DR EMBL; AF204944; AAF26368.1; 

DR EMBL; AF200192; AAF13714.1; -. 

DR EMBL; AL163284; CAB90458.1; 

DR EMBL; AL163285; CAB90554.1; 

DR EMBL; BC014453; AAH14453.1; -. 

DR HSSP; P00797; 2REN.

DR MEROPS; A01.041; -.

DR Genew; HGNC:934; BACE2.

DR MIM; 605668;

DR GO; GO:0005624; C:membrane fraction; TAS.

DR GO; GO:0004190; F:aspartic-type endopeptidase activity; TAS.

DR GO; GO:0006464; P:protein modification; TAS. 



DR GO; GO:0009306; P:protein secretion; TAS.
DR InterPro; IPR001969; Aspprotease_AS.
DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PROSITE; PS00141; ASP_PROTEASE; 2.
KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembr
KW Signal.
FT SIGNAL      1    20   POTENTIAL.
FT PROPEP     21         POTENTIAL.
FT CHAIN           518   BETA SECRETASE 2.
FT DOMAIN     21   473   EXTRACELLULAR (POTENTIAL).
FT TRANSMEM  474   494   POTENTIAL.
FT DOMAIN    495   518   CYTOPLASMIC (POTENTIAL).
FT ACT_SITE  110   110   BY SIMILARITY.
FT ACT_SITE  303   303   BY SIMILARITY.
FT CARBOHYD  170   170   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD  366   366   N-LINKED (GLCNAC...) (POTENTIAL).
FT CONFLICT   36    36   A -> T (IN REF. 6).
SQ SEQUENCE 518 AA; 56180 MW; 2E903150823760D3 CRC64;

Query Match 100.0%; Score 2687; DB 1; Length 518;



Best Local Similarity 100.0%; Pred. No. 2e-187; 

Matches 518; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MGALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGL 60

Qy  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 ALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAG 120

Qy 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATI 180

Qy 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 FESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPVA 240

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

Qy 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 IVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFD 420

Qy 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 RAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVCG 480

Qy 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518
       ||||||||||||||||||||||||||||||||||||||
Db 481 AILLVLIVLLLLPFRCQRRPRDPEVVNDESSLVRHRWK 518



RESULT 2 
BACE_RAT



ID BACE_RAT STANDARD; PRT; 501 AA. 

AC P56819; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Beta-secretase precursor (EC 3.4.23.46) (Beta-site APP cleaving 

DE enzyme) (Beta-site amyloid precursor protein cleaving enzyme) 

DE (Aspartyl protease 2) (Asp 2) (ASP2) (Membrane-associated aspartic 

DE protease 2) (Memapsin-2) . 

GN BACE . 

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20002972; PubMed=10531052;

RA Vassar R., Bennett B.D., Babu-Khan S., Kahn S., Mendiaz E.A.,

RA Denis P., Teplow D.B., Ross S., Amarante P., Loeloff R., Luo Y.,

RA Fisher S., Fuller J., Edenson S., Lile J., Jarosinski M.A.,

RA Biere A.L., Curran E., Burgess T., Louis J.-C., Collins F.,

RA Treanor J., Rogers G., Citron M.;

RT "Beta-secretase cleavage of Alzheimer's amyloid precursor protein by 

RT the transmembrane aspartic protease BACE."; 

RL Science 286:735-741(1999). 

CC -!- FUNCTION: Responsible for the proteolytic processing of the 

CC amyloid precursor protein (APP). Cleaves at the amino terminus of

CC the A-beta peptide sequence, between residues 671 and 672 of APP,

CC leads to the generation and extracellular release of beta-cleaved

CC soluble APP, and a corresponding cell-associated carboxy-terminal

CC fragment which is later released by gamma-secretase (By

CC similarity).

CC -!- CATALYTIC ACTIVITY: Broad endopeptidase specificity. Cleaves Glu- 

CC Val-Asn-Leu- | -Asp-Ala-Glu-Phe in the Swedish variant of 

CC Alzheimer's amyloid precursor protein.

CC -!- SUBCELLULAR LOCATION: Type I membrane protein.

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; AF190727; AAF04144.1; -. 

DR HSSP; P32329; 1YPS.

DR MEROPS; A01.004;

DR InterPro; IPR001969; Aspprotease_AS.



DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 1.
KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane;
KW Signal.
FT SIGNAL      1    21   POTENTIAL.
FT PROPEP     22    45   POTENTIAL.
FT CHAIN      46   501   BETA-SECRETASE.
FT DOMAIN     22   457   EXTRACELLULAR (POTENTIAL).
FT TRANSMEM  458   478   POTENTIAL.
FT DOMAIN    479   501   CYTOPLASMIC (POTENTIAL).
FT ACT_SITE   93    93   BY SIMILARITY.
FT ACT_SITE  289   289   BY SIMILARITY.
FT DISULFID  216   420   BY SIMILARITY.
FT DISULFID  278   443   BY SIMILARITY.
FT DISULFID  330   380   BY SIMILARITY.
FT CARBOHYD  153   153   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD  172   172   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD  223   223   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD  354   354   N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 501 AA; 55806 MW; 24B445BC8BE87DE3 CRC64;



Query Match 44.2%; Score 1187; DB 1; Length 501; 

Best Local Similarity 46.4%; Pred. No. 1.2e-78; 

Matches 240; Conservative 82; Mismatches 165; Indels 30; Gaps 9; 

Qy 9 LLP LLAQWLLRAAP ELAP AP FT L P L R VAAAT N RWA P T P G P — GTPAERHADGLA 61 

: I I II : I I I I I I I I III II 

Db 1 MAPALRWLLLWVGSGMLPAQGTHLGIRLPLRSGLA GPPLGLRLPRETDE — 49 

Qy 62 LALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGT 121

II I : I : I I I I I : I I I : I I I : I I : I : I I I I I I I I I I M I I I I 

Db 50 EP — EEPGRRGS FVEMVDNLRGKS GQGYYVEMTVGS P PQTLNI LVDTGS SNFAVGAA 104 

Qy 122 PHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIF 181

II:: I : : I I I I I I I I I I I I I : I I I I : I I I I : I I I I 

Db 105 PHPFLHRYYQRQLSSTYRDLRKSVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAIT 164

Qy 182 ESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV-- 239

II: II: I I I I I I I I I I : I : I III I I I I I I I : I M : I I : I : I I I I I : 

Db 165 ESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQ 224 

Qy 240 -AGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNAD 298 

: I I I : : : I II : III I : II I I I : I I I I : : I : : : I I II I : I I : I I I I 

Db 225 TEALASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVTIVRVEINGQDLKMDCKEYNYD 284 

Qy 299 KAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKI 358

I : I II I I I I I I I I : I I I : I I : : : II : I I I I I I I I I I I I I : II I

Db 285 KSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVI 344

Qy 359 SIYLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYV 417 

I : I I | : : : | | M I I I I I I : : I : : : I I : I : I I : I : I I : I I I I I I 

Db 345 SLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYV 404 



Qy 418 IFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMS 477

•1111*11*11111 * * I I I I I 1 1

Db 405 VFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAA 464

Qy 478 VCGAILLVLIVLLLLPFRCQR--RPRDPEWNDESSL 512

: I I : : : : I : : : I I I I : : :| I I

Db 465 IC-ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500
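
The percentages in each hit summary appear to follow directly from the counts printed beside them. The sketch below is an inference from the figures in this report, not a documented formula of the search program: it assumes "Query Match" is the hit score divided by the query's self-score (2687, the score of the 100% match in RESULT 1), and that "Best Local Similarity" is Matches divided by the total aligned columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
# Illustrative check of the summary percentages; the formulas are
# assumptions inferred from the numbers printed in this report.

def query_match_pct(score, self_score):
    """Hit score as a percentage of the query's self-alignment score."""
    return round(100.0 * score / self_score, 1)


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all aligned columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


if __name__ == "__main__":
    # Figures taken from the RESULT 2 summary above.
    print(query_match_pct(1187, 2687))
    print(best_local_similarity_pct(240, 82, 165, 30))
```

Under these assumptions the RESULT 2 figures (Score 1187; Matches 240, Conservative 82, Mismatches 165, Indels 30) reproduce the reported 44.2% and 46.4%.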



RESULT 3 
BACE_MOUSE



ID BACE_MOUSE STANDARD; PRT; 501 AA. 

AC P56818; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Beta-secretase precursor (EC 3.4.23.46) (Beta-site APP cleaving 

DE enzyme) (Beta-site amyloid precursor protein cleaving enzyme) 

DE (Aspartyl protease 2) (Asp 2) (ASP2) (Membrane-associated aspartic 

DE protease 2) (Memapsin-2) . 

GN BACE . 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20002972; PubMed=10531052;

RA Vassar R., Bennett B.D., Babu-Khan S., Kahn S., Mendiaz E.A.,

RA Denis P., Teplow D.B., Ross S., Amarante P., Loeloff R., Luo Y.,

RA Fisher S., Fuller J., Edenson S., Lile J., Jarosinski M.A.,

RA Biere A.L., Curran E., Burgess T., Louis J.-C., Collins F.,

RA Treanor J., Rogers G., Citron M.;

RT "Beta-secretase cleavage of Alzheimer's amyloid precursor protein by

RT the transmembrane aspartic protease BACE."; 

RL Science 286:735-741(1999). 

RN [2] 

RP REVISIONS TO 6 AND 81-87. 

RA Bennett B.D., Vassar R., Citron M.;

RL Submitted (JAN-2000) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20057170; PubMed=10591213;

RA Yan R., Bienkowski M.J., Shuck M.E., Miao H., Tory M.C., Pauley A.M.,

RA Brashier J.R., Stratman N.C., Mathews W.R., Buhl A.E., Carter D.B., 

RA Tomasselli A.G., Parodi L.A., Heinrikson R.L., Gurney M.E.; 

RT "Membrane-anchored aspartyl protease with Alzheimer's disease 

RT beta-secretase activity."; 

RL Nature 402:533-537(1999). 

CC -!- FUNCTION: Responsible for the proteolytic processing of the 

CC amyloid precursor protein (APP). Cleaves at the amino terminus of

CC the A-beta peptide sequence, between residues 671 and 672 of APP,

CC leads to the generation and extracellular release of beta-cleaved

CC soluble APP, and a corresponding cell-associated carboxy-terminal

CC fragment which is later released by gamma-secretase (By

CC similarity).

CC -!- CATALYTIC ACTIVITY: Broad endopeptidase specificity. Cleaves Glu-
CC Val-Asn-Leu-|-Asp-Ala-Glu-Phe in the Swedish variant of
CC Alzheimer's amyloid precursor protein.
CC -!- SUBCELLULAR LOCATION: Type I membrane protein.
CC -!- TISSUE SPECIFICITY: Brain.
CC -!- SIMILARITY: Belongs to peptidase family A1.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AF190726; AAF04143.2;
DR EMBL; AF200346; AAF17082.1; -.
DR HSSP; P56272; LAMS.
DR MEROPS; A01.004;
DR MGD; MGI:1346542; Bace.
DR InterPro; IPR001969; Aspprotease_AS.
DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 1.
KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane;
KW Signal.








FT   SIGNAL        1     21       POTENTIAL.
FT   PROPEP       22     45       POTENTIAL.
FT   CHAIN        46    501       BETA-SECRETASE.
FT   DOMAIN       22    457       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    458    478       POTENTIAL.
FT   DOMAIN      479    501       CYTOPLASMIC (POTENTIAL).
FT   ACT_SITE     93     93       BY SIMILARITY.
FT   ACT_SITE    289    289       BY SIMILARITY.
FT   DISULFID    216    420       BY SIMILARITY.
FT   DISULFID    278    443       BY SIMILARITY.
FT   DISULFID    330    380       BY SIMILARITY.
FT   CARBOHYD    153    153       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    172    172       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    223    223       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    354    354       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   501 AA;  55747 MW;  C085A013145E474E CRC64;


Query Match 44.1%; Score 1185; DB 1; Length 501;

Best Local Similarity 46.0%; Pred. No. 1.7e-78;

Matches 237; Conservative 83; Mismatches 169; Indels 26; Gaps 7;



Qy 9 LLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGLALA 63

: I I II : II I I I I I I II II :

Db 1 MAPALHWLLLWVGSGMLPAQGTHLGIRLPLRSGLA----GPPLGLRLPRETDEES 51

Qy 64 LEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTPH 123

I : I : I I II I : I I I : I I I : I I : I : I I I I I I I I I I I I I I I I II

Db 52 EEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPH 106

Qy 124 SYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFES 183

: : I : : I I I I I I I I II I I I : I I I I : I I I I : I I I I I I

Db 107 PFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITES 166

Qy 184 ENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV---A 240

• 11* 1 I 1 1 1 t 1 1 1 1 -1:1 Ml 1 1 1 1 1 1 1 : 1 1 1 : 1 1 : 1 : 1 1 1 1 1 :

Db 167 DKFFINGSNWEGILGIAYAEIARPDDSLEPFFDSLVKQTHIPNIFSLQLCGAGFPLNQTE 226

Qy 241 GSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKA 300

- 1 i 1 • • - 1 I 1 • Ml 1 : 1 1 1 II : MM:: 1 : : : || II I : I 1 : 1 II II :

Db 227 ALASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKS 286

Qy 301 IVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISI 360

II 1 1 1 11 II II : M 1 : 1 1 : : : II M MM 1 II II MM- II 1 M
i i i i i i i iiii*iii*i i • • • ii > iiii i ii ii iii ii

Db 287 IVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISL 346

Qy 361 YLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGFYVIF 419

|| | •■■lllllllll 1 • • 1 • * • 1 I • 1 "1 1" 1 II llllll i

Db 347 YLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAVSQSSTGTVMGAVIMEGFYVVF 406

Qy 420 DRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYALMSVC 479

1 II : 1 1 : II 1 1 1 : : | II 1 1 : 1 1 : : 1" : : : 1

Db 407 DRARKRIGFAVSACHVHDEFRTAAVEGPFVTADMEDCGYNIPQTDESTLMTIAYVMAAIC 466

Qy 480 GAILLVLIVLLLLPFRCQR--RPRDPEWNDESSL 512

| : : : : I : : : II 1 1 : : Mil

Db 467 -ALFMLPLCLMVCQWRCLRCLRHQHDDFADDISLL 500





RESULT 4 
BACE_HUMAN

ID BACE_HUMAN STANDARD; PRT; 501 AA.

AC P56817; Q9BYB9; Q9BYC0; Q9BYC1; Q9UJT5; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Beta-secretase precursor (EC 3.4.23.46) (Beta-site APP cleaving 

DE enzyme) (Beta-site amyloid precursor protein cleaving enzyme) 

DE (Aspartyl protease 2) (Asp 2) (ASP2) (Membrane-associated aspartic 

DE protease 2) (Memapsin-2) . 

GN BACE OR BACE1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM A) . 

RC TISSUE=Brain; 

RX MEDLINE=20002972; PubMed=10531052;

RA Vassar R., Bennett B.D., Babu-Khan S., Kahn S., Mendiaz E.A.,

RA Denis P., Teplow D.B., Ross S., Amarante P., Loeloff R., Luo Y.,

RA Fisher S., Fuller J., Edenson S., Lile J., Jarosinski M.A.,

RA Biere A.L., Curran E., Burgess T., Louis J.-C., Collins F.,

RA Treanor J., Rogers G., Citron M.;

RT "Beta-secretase cleavage of Alzheimer's amyloid precursor protein by

RT the transmembrane aspartic protease BACE."; 

RL Science 286:735-741(1999). 
RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM A), SEQUENCE OF 46-68, AND 



RP CHARACTERIZATION . 

RC TISSUE=Brain; 

RX MEDLINE=20057171; PubMed=10591214;

RA Sinha S., Anderson J.P., Barbour R., Basi G.S., Caccavello R.,

RA Davis D., Doan M., Dovey H.F., Frigon N., Hong J., Jacobson-Croak K.,

RA Jewett N., Keim P., Knops J., Lieberburg I., Power M., Tan H.,

RA Tatsuno G., Tung J., Schenk D., Seubert P., Suomensaari S.M., Wang S.,

RA Walker D., Zhao J., McConlogue L., Varghese J.;

RT "Purification and cloning of amyloid precursor protein beta-secretase 

RT from human brain."; 

RL Nature 402:537-540(1999). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM A) . 

RX MEDLINE=20057170; PubMed=10591213;

RA Yan R., Bienkowski M.J., Shuck M.E., Miao H., Tory M.C., Pauley A.M.,

RA Brashier J.R., Stratman N.C., Mathews W.R., Buhl A.E., Carter D.B.,

RA Tomasselli A.G., Parodi L.A., Heinrikson R.L., Gurney M.E.;

RT "Membrane-anchored aspartyl protease with Alzheimer's disease beta- 

RT secretase activity."; 

RL Nature 402:533-537(1999). 

RN [4] 

RP SEQUENCE FROM N.A. (ISOFORM A) . 

RX MEDLINE=20120043; PubMed=10656250;

RA Hussain I., Powell D.J., Howlett D.R., Tew D.G., Meek T.D.,

RA Chapman C., Gloger I.S., Murphy K.E., Southan C.D., Ryan D.M.,

RA Smith T.S., Simmons D.L., Walsh F.S., Dingwall C., Christie G.;

RT "Identification of a novel aspartic proteinase (Asp 2) as beta- 

RT secretase."; 

RL Mol. Cell. Neurosci. 14:419-427(1999). 

RN [5] 

RP SEQUENCE FROM N.A. (ISOFORM B) . 

RC TISSUE=Brain, and Pancreas; 

RA Michel B., De Pietri Tonelli D., Zacchetti D., Keller P.; 

RT "New beta-site APP cleaving enzyme isoform (BACE-1B) obtained from 

RT human brain and pancreas."; 

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases. 

RN [6] 

RP SEQUENCE FROM N.A. (ISOFORM C) . 

RC TISSUE=Pancreas;

RA Zacchetti D., De Pietri Tonelli D., Schnurbus R. ; 

RT "New beta-site APP cleaving enzyme isoform (BACE-1C) obtained from 

RT human pancreas."; 

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases. 

RN [7] 

RP SEQUENCE FROM N.A. (ISOFORMS B, C AND D).

RC TISSUE=Brain;

RX MEDLINE=21408467; PubMed=11516562;

RA Tanahashi H., Tabira T.; 

RT "Three novel alternatively spliced isoforms of the human beta-site 

RT amyloid precursor protein cleaving enzyme (BACE) and their effect on 

RT amyloid beta-peptide production."; 

RL Neurosci. Lett. 307:9-12(2001). 

RN [8] 

RP SEQUENCE OF 14-501 FROM N.A. (ISOFORM A), AND CHARACTERIZATION.

RX MEDLINE=20144060; PubMed=10677483;

RA Lin X., Koelsch G., Wu S., Downs D., Dashti A., Tang J.;

RT "Human aspartic protease memapsin 2 cleaves the beta-secretase site of 



RT beta-amyloid precursor protein."; 

RL Proc. Natl. Acad. Sci. U.S.A. 97:1456-1460(2000). 

RN [9] 

RP DISULFIDE BONDS. 

RX MEDLINE=21950860; PubMed=11953458;

RA Fischer F., Molinari M., Bodendorf U., Paganetti P.;

RT "The disulphide bonds in the catalytic domain of BACE are critical but 

RT not essential for amyloid precursor protein processing activity."; 

RL J. Neurochem. 80:1079-1088(2002). 

CC -!- FUNCTION: Responsible for the proteolytic processing of the 

CC amyloid precursor protein (APP). Cleaves at the amino terminus of

CC the A-beta peptide sequence, between residues 671 and 672 of APP,

CC leads to the generation and extracellular release of beta-cleaved

CC soluble APP, and a corresponding cell-associated carboxy-terminal

CC fragment which is later released by gamma-secretase.

CC -!- CATALYTIC ACTIVITY: Broad endopeptidase specificity. Cleaves Glu- 

CC Val-Asn-Leu- | -Asp-Ala-Glu-Phe in the Swedish variant of 

CC Alzheimer's amyloid precursor protein. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=4;

CC Name=A; Synonyms=BACE-1A, BAC-501;

CC IsoId=P56817-1; Sequence=Displayed;

CC Name=B; Synonyms=BACE-1B, BACE-I-476;

CC IsoId=P56817-2; Sequence=VSP_005223;

CC Name=C; Synonyms=BACE-1C, BACE-I-457;

CC IsoId=P56817-3; Sequence=VSP_005222;

CC Name=D; Synonyms=BACE-1D, BACE-I-432;

CC IsoId=P56817-4; Sequence=VSP_005222, VSP_005223;

CC -!- TISSUE SPECIFICITY: Brain.

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF190725; AAF04142.1; -. 

DR EMBL; AF201468; AAF18982.1; -. 

DR EMBL; AF200343; AAF17079.1; -. 

DR EMBL; AF204943; AAF26367.1; -. 

DR EMBL; AF338816; AAK38374.1;

DR EMBL; AF338817; AAK38375.1; -.

DR EMBL; AB050436; BAB40931.1;

DR EMBL; AB050437; BAB40932.1; -.

DR EMBL; AB050438; BAB40933.1;

DR EMBL; AF200193; AAF13715.1; -. 

DR PIR; A59090; A59090. 

DR PDB; 1M4H; 28-AUG-02. 

DR MEROPS; A01.004; -. 

DR Genew; HGNC:933; BACE. 

DR MIM; 604252; 

DR GO; GO:0005887; C:integral to plasma membrane; TAS.
DR GO; GO:0008798; F:beta-aspartyl-peptidase activity; TAS.
DR GO; GO:0009405; P:pathogenesis; TAS.
DR GO; GO:0006508; P:proteolysis and peptidolysis; TAS.
DR InterPro; IPR001969; Aspprotease_AS.
DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 1.
KW Hydrolase; Aspartyl protease; Glycoprotein; Zymogen; Transmembrane;
KW Signal; Alternative splicing; 3D-structure.


FT   SIGNAL        1     21       POTENTIAL.
FT   PROPEP       22     45
FT   CHAIN        46    501       BETA-SECRETASE.
FT   DOMAIN       22    457       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    458    478       POTENTIAL.
FT   DOMAIN      479    501       CYTOPLASMIC (POTENTIAL).
FT   ACT_SITE     93     93       BY SIMILARITY.
FT   ACT_SITE    289    289       BY SIMILARITY.
FT   DISULFID    216    420
FT   DISULFID    278    443
FT   DISULFID    330    380
FT   CARBOHYD    153    153       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    172    172       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    223    223       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    354    354       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    146    189       Missing (in isoform C and isoform D).
FT                                /FTId=VSP_005222.
FT   VARSPLIC    190    214       Missing (in isoform B and isoform D).
FT                                /FTId=VSP_005223.
SQ   SEQUENCE   501 AA;  55763 MW;  377CE4C824ACEF05 CRC64;



Query Match 43.9%; Score 1178.5; DB 1; Length 501; 

Best Local Similarity 46.2%; Pred. No. 5.1e-78; 

Matches 240; Conservative 82; Mismatches 164; Indels 33; Gaps 9;

Qy 7 ALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGLA 61

I I I I I : : I I I I I I I II II 
Db 2 AQALPWLLLWM GAGVLPAHGTQHGIRLPLRSGLG GAPL GLR 42 

Qy 62 LALE--PALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVA 119

I I I : I : I I I I I : I I I : I I I : I I : I : I I I I I I I I I I I I I I I I 

Db 43 LPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVG 102 



Qy 120 GTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT 179 

II:: I : : I I I I I I I M I I I I : I I I I : I I I I : III 

Db 103 AAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAA 162

Qy 180 IFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANIPNVFSMQMCGAGLPV 239 

I II: II: I I I I I I I I I I : I : I III I I I I I I I : : I I : I I : I : I I I I I : 

Db 163 ITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPL 222 



Qy 240 AGS GTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYN 296 

I : I I I : : : I I I : III I : I I I I I : MM:: I : : : I I II I : I I : I I I 

Db 223 NQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYN 282 



Qy 297 ADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFP 356 

I I : I I I I I I I I I II : I I I : I I :: : II : I I I I I I I I I I I I I : II 



Db 283 YDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFP 342



Qy 357 KISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY-ECYRFGISPSTNALVIGATVMEGF 415 

I I : I I I : : : I I I II II I I I : : I : : : I I : I I I I : I'll -Mil 

Db 343 VISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGF 402 

Qy 416 YVIFDRAQKRVGFAASPCAEIAGAAVSEISGPFSTEDVASNCVPAQSLSEPILWIVSYAL 475

I I : I I I I : I I : I II I I : : I II I I : I I —I : 

Db 403 YVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVM 462

Qy 476 MSVCGAILLVLIVLLLLPFRCQR--RPRDPEWNDESSL 512

: : I I : : : : I : : : I I I I : : : I I I 
Db 463 AAIC-ALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL 500 



RESULT 5 
PEPC_MACFU 

ID PEPC_MACFU STANDARD; PRT; 377 AA. 

AC P03955; 

DT 23-OCT-1986 (Rel. 02, Created) 

DT 01-AUG-1992 (Rel. 23, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Gastricsin precursor (EC 3.4.23.3) (Pepsinogen C) (Fragment). 

GN PGC. 

OS Macaca fuscata fuscata (Japanese macaque) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Cercopithecidae; 

OC Cercopithecinae ; Macaca. 

OX NCBI_TaxID=9543 ; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Gastric mucosa; 

RX MEDLINE=92037645; PubMed=1935977;

RA Kageyama T., Tanabe K., Koiwai O.;

RT "Development-dependent expression of isozymogens of monkey

RT pepsinogens and structural differences between them.";

RL Eur. J. Biochem. 202:205-215(1991).

RN [2] 

RP SEQUENCE OF 6-377. 

RX MEDLINE=86168133; PubMed=3514597 ; 

RA Kageyama T., Takahashi K. ; 

RT "The complete amino acid sequence of monkey progastricsin.";

RL J. Biol. Chem. 261:4406-4419(1986). 

RN [3] 

RP SEQUENCE OF 6-65. 

RX MEDLINE=85289106; PubMed=3928607;

RA Kageyama T., Takahashi K. ; 

RT "Monkey pepsinogens and pepsins. VII. Analysis of the activation 

RT process and determination of the NH2-terminal 60-residue sequence of 

RT Japanese monkey progastricsin, and molecular evolution of 

RT pepsinogens."; 

RL J. Biochem. 97:1235-1246(1985). 

CC -!- CATALYTIC ACTIVITY: More restricted specificity than pepsin A, but
CC shows preferential cleavage at Tyr-|-Xaa bonds; high activity
CC towards hemoglobin as substrate.
CC -!- PTM: Each pepsinogen is converted to corresponding pepsin at pH
CC 2.0 in part as a result of the release of a 47 aa activation
CC segment and in part as a result of stepwise proteolytic cleavage
CC via an intermediate form(s).
CC -!- MISCELLANEOUS: The expression of pepsinogen genes is regulated by
CC hormones and related substances.
CC -!- SIMILARITY: Belongs to peptidase family A1.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X59754; CAA42426.1;
DR PIR; S19683; PEMQCJ.
DR HSSP; P20142; 1AVF.
DR MEROPS; A01.003;
DR InterPro; IPR001969; Aspprotease_AS.
DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 2.
KW Hydrolase; Aspartyl protease; Zymogen; Digestion; Signal.


FT   NON_TER       1      1
FT   SIGNAL       <1      5
FT   PROPEP        6     31       ACTIVATION PEPTIDE.
FT   PROPEP       32     48       ACTIVATION PEPTIDE.
FT   CHAIN        49    377       GASTRICSIN.
FT   DISULFID     93     98
FT   DISULFID    256    260
FT   DISULFID    299    332
FT   ACT_SITE     80     80
FT   ACT_SITE    265    265
FT   CONFLICT    331    331       Y -> V (IN REF. 2).
FT   CONFLICT    349    349       L -> LVY (IN REF. 2).
SQ   SEQUENCE   377 AA;  41148 MW;  2CFB8F8BF26D77CE CRC64;



Query Match 13.5%; Score 363.5; DB 1; Length 377; 

Best Local Similarity 28.9%; Pred. No. 4.5e-19; 

Matches 118; Conservative 65; Mismatches 118; Indels 107; Gaps 19; 



Qy 56 HADGLALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSN 115

I I : : : I I : I : I I : I : I I I I I I : I I I I I I I

Db 44 HFGDLSVSYEP MAYMD AAYFGEISIGTPPQNFLVLFDTGSSN 85

Qy 116 FAV AGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPK 167

I I I I I I : II I I : I : : : I I I I I I I I : I :

Db 86 LWVPSVYCQSQACTSHS RFNPSESSTYSTNGQTFSLQYGSGSLTGFFGYDTLTV -- 139

Qy 168 GFNTSFLVNIATIFESENFFLPG IKWNGILGLAYATLAKPSSSLETFFDSLVTQA 222

I I III II : : : I I : I II I I I : : : I : I :

Db 140 QSIQVPNQEFGLSEN--EPGTNFVYAQFDGIMGLAYPTLSVDGAT--TAMQGMVQEG 192

Qy 223 NIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKL 281

: : : I I : : I : : I I : : I II : : I I I I I : : I : : I I : I I I :

Db 193 ALTSPIFSVYLSDQ QGSSGGAWFGGVDSSLYTGQIYWAPVTQELYWQIGIEEF 246



Qy 282 EIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGFWTGSQ 341 

INI: II : I I I I : I I : I I : I I : I : : : I I I : I 

Db 247 LIGGQASGW-CSE -- GCQAIVDTGTSLLTVPQQYMSALLQA TGAQ 288

Qy 342 LACWTNSETPWSYF PKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY 391 

I : I I : : : : I II II 

Db 289 EDEYGQFLVNCNSIQNLPTLTFII NGVEFPLPPSSYI LNN 328 

Qy 392 ECY-RFGISP STNALVI GATVMEGFYVI FDRAQKRVGFAAS 431 

I I : I I : : I : : I : : I : I I I I I : 

Db 329 NGYCTVGVEPTYLSAQNSQPLWILGDVFLRSYYSVYDLSNNRVGFATA 376 



RESULT 6 
PEPC_HUMAN 

ID PEPC_HUMAN STANDARD; PRT; 388 AA.

AC P20142; 

DT 01-FEB-1991 (Rel. 17, Created) 

DT 01-FEB-1991 (Rel. 17, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Gastricsin precursor (EC 3.4.23.3) (Pepsinogen C) . 

GN PGC . 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=88087276; PubMed=3335549;

RA Hayano T., Sogawa K., Ichihara Y., Fujii-Kuriyama Y., Takahashi K.;

RT "Primary structure of human pepsinogen C gene."; 

RL J. Biol. Chem. 263:1382-1385(1988). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=89079679; PubMed=2909526;

RA Taggart R.T., Cass L.G., Mohandas T.K., Derby P., Barr P.J., Pals G.,

RA Bell G.I.;

RT "Human pepsinogen C (progastricsin) . Isolation of cDNA clones, 

RT localization to chromosome 6, and sequence homology with pepsinogen 

RT A."; 

RL J. Biol. Chem. 264:375-379(1989). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta; 

RX MEDLINE=89290840; PubMed=2567697;

RA Pals G., Azuma T., Mohandas T.K., Bell G.I., Bacon J., 

RA Samloff I.M., Walz D.A., Barr P.J., Taggart R.T.; 

RT "Human pepsinogen C (progastricsin) polymorphism: evidence for a 

RT single locus located at 6p21.1-pter.";

RL Genomics 4:137-148(1989). 

RN [4] 

RP SEQUENCE FROM N.A. 

RA Wong R.N.S., Tang J.; 

RL Submitted (NOV-1996) to the EMBL/GenBank/DDBJ databases.

RN [5] 



RP SEQUENCE OF 17-101. 

RX MEDLINE=90130402; PubMed=2515193;

RA Athauda S.B.P., Tanji M., Kageyama T., Takahashi K.;

RT "A comparative study on the NH2-terminal amino acid sequences and 

RT some other properties of six isozymic forms of human pepsinogens and 

RT pepsins."; 

RL J. Biochem. 106:920-927(1989). 

RN [6] 

RP SEQUENCE OF 17-64. 

RX MEDLINE=83079318; PubMed=6816595; 

RA Foltmann B., Jensen A.L.; 

RT "Human progastricsin. Analysis of intermediates during activation 

RT into gastricsin and determination of the amino acid sequence of the

RT propart.";

RL Eur. J. Biochem. 128:63-70(1982). 

RN [7] 

RP X-RAY CRYSTALLOGRAPHY (1.62 ANGSTROMS). 

RX MEDLINE=95230687; PubMed=7714902;

RA Moore S.A., Sielecki A.R., Chernaia M.M., Tarasova N.I., James M.N.G.;

RT "Crystal and molecular structures of human progastricsin at 1.62-A 

RT resolution."; 

RL J. Mol. Biol. 247:466-485(1995). 

RN [8] 

RP X-RAY CRYSTALLOGRAPHY (2.36 ANGSTROMS). 

RX MEDLINE=98069649; PubMed=9406551;

RA Khan A.R., Cherney M.M., Tarasova N.I., James M.N.;

RT "Structural characterization of activation 'intermediate 2' on the

RT pathway to human gastricsin."; 

RL Nat. Struct. Biol. 4:1010-1015(1997). 

CC -!- CATALYTIC ACTIVITY: More restricted specificity than pepsin A, but 

CC shows preferential cleavage at Tyr- | -Xaa bonds; high activity 

CC towards hemoglobin as substrate. 

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M18667; AAA60062.1; ALT_INIT.

DR EMBL; M18659; AAA60062.1; JOINED.

DR EMBL; M18660; AAA60062.1; JOINED.

DR EMBL; M18661; AAA60062.1; JOINED.

DR EMBL; M18662; AAA60062.1; JOINED. 

DR EMBL; M18663; AAA60062.1; JOINED. 

DR EMBL; M18664; AAA60062.1; JOINED. 

DR EMBL; M18665; AAA60062.1; JOINED. 

DR EMBL; M18666; AAA60062.1; JOINED. 

DR EMBL; M23077; AAA60063.1; -. 

DR EMBL; M23069; AAA60063.1; JOINED. 

DR EMBL; M23070; AAA60063.1; JOINED. 

DR EMBL; M23071; AAA60063.1; JOINED. 

DR EMBL; M23072; AAA60063.1; JOINED. 

DR EMBL; M23073; AAA60063.1; JOINED. 



DR EMBL; M23074; AAA60063.1; JOINED.
DR EMBL; M23075; AAA60063.1; JOINED.
DR EMBL; J04443; AAA60074.1;
DR EMBL; U75272; AAB18273.1; -.
DR PIR; A29937; A29937.
DR PDB; 1HTR; 26-JAN-95.
DR PDB; 1AVF; 25-FEB-98.
DR MEROPS; A01.003; -.
DR Genew; HGNC:8890; PGC.
DR MIM; 169740; -.
DR GO; GO:0005615; C:extracellular space;
DR GO; GO:0004190; F:aspartic-type endopep
DR GO; GO:0007586; P:digestion; TAS.
DR GO; GO:0006508; P:proteolysis and pepti
DR InterPro; IPR001969; Aspprotease_AS.
DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 2.
KW Hydrolase; Aspartyl protease; Zymogen;
KW 3D-structure.


FT   SIGNAL        1     16
FT   PROPEP       17     59       ACTIVATION PEPTIDE.
FT   CHAIN        60    388       GASTRICSIN.
FT   ACT_SITE     91     91
FT   ACT_SITE    276    276
FT   DISULFID    104    109
FT   DISULFID    267    271
FT   DISULFID    310    343
FT   CONFLICT     40     41       GE -> ED (IN REF. 6).
FT   CONFLICT     52     52       W -> S (IN REF. 6).
FT   STRAND       19     25
FT   HELIX        29     35
FT   TURN         36     37
FT   HELIX        39     43
FT   TURN         44     45
FT   HELIX        50     54
FT   HELIX        65     68
FT   TURN         69     70
FT   STRAND       73     79
FT   TURN         80     83
FT   STRAND       84     91
FT   TURN         92     93
FT   STRAND       97    101
FT   TURN        102    103
FT   HELIX       107    110
FT   TURN        111    111
FT   STRAND      115    115
FT   HELIX       117    119
FT   TURN        121
FT   STRAND      124    134
FT   TURN        135    136
FT   STRAND      137    150
FT   TURN        151    152
FT   STRAND      153    163
FT   HELIX       169    173
FT   STRAND      178    181
FT   TURN        190    191
FT   HELIX       195    201
FT   TURN        202    203
FT   STRAND      209    214
FT   STRAND      221    227
FT   HELIX       232    234
FT   STRAND      235    244
FT   STRAND      251    254
FT   STRAND      256    259
FT   TURN        260    261
FT   STRAND      262    263
FT   TURN        266    269
FT   STRAND      271    275
FT   TURN        277    278
FT   STRAND      282    285
FT   HELIX       286    288
FT   HELIX       289    296
FT   TURN        297    297
FT   STRAND      299    300
FT   TURN        302    303
FT   STRAND      306    308
FT   HELIX       310    315
FT   STRAND      319    323
FT   TURN        324    325
FT   STRAND      326    330
FT   HELIX       332    335
FT   STRAND      336    338
FT   STRAND      343    345
FT   STRAND      347    350
FT   TURN        355    356
FT   STRAND      360    363
FT   HELIX       365    368
FT   TURN        369    370
FT   STRAND      371    376
FT   TURN        377    380
FT   STRAND      381    388
SQ   SEQUENCE   388 AA;  42  F862DFDC1438BB92 CRC64;



Query Match 13.1%; Score 353; DB 1; Length 388; 

Best Local Similarity 29.1%; Pred. No. 2.7e-18; 

Matches 120; Conservative 65; Mismatches 120; Indels 108; Gaps 21;



Qy 52 PAERHADG-LALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVD 110

||:: ||:: II : I : I I : I : I I I I I I ■ I I

Db 50 PAWKYRFGDLSVTYEP MAYMD AAYFGEISIGTPPQNFLVLFD 91

Qy 111 TGSSNFAV AGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDL 162

I I I I I | I I I I I : II I I : I : : : I I I I I I I I

Db 92 TGSSNLWVPSVYCQSQACTSHS RFNPSESSTYSTNGQTFSLQYGSGSLTGFFGYDT 147

Qy 163 VTIPKGFNTSFLVNIATIFESENFFLPG IKWNGILGLAYATLAKPSSSLETFFDS 217

: I : I I III I I :: : I I : I II I I : : : I

Db 148 LTV QSIQVPNQEFGLSEN--EPGTNFVYAQFDGIMGLAYPALSVDEAT--TAMQG 198

Qy 218 LVTQANIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQI 276

: I : : : I I I : : I : : I I : : I I I : : I I I I I : : I : : I I : I I

Db 199 MVQEGALTSPVFSVYLSNQ QGSSGGAWFGGVDSSLYTGQIYWAPVTQELYWQI 252



Qy 277 EILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPEFSDGF 336 

I : MM: || : | | | | : | | : | | : | | : I : : : 
Db 253 GIEEFLIGGQASGW-CSE — GCQAIVDTGTSLLTVPQQYMSALLQA 295 

Qy 337 WTGSQLACWTNSETPWSYF PKI SI YLRDENSSRSFRITILPQLYIQPMMG 386 

I I : I I : I I : : : : I II 

Db 296 -TGAQ EDEYGQFLVNCNSIQNLPSLTFII NGVEFPLPPSSYI 336 

Qy 387 AGLNYECY-RFGISP STNA LVI GATVMEGFYVI FDRAQKRVGFAAS 4 31 

I : I I : I II : : I : : I : : I Mill: 

Db 337 — LSNNGYCTVGVEPTYLSSQNGQPLWILGDVFLRSYYSVYDLGNNRVGFATA 387 



RESULT 7 
PEPC_CALJA



ID PEPC_CALJA STANDARD; PRT; 388 AA. 

AC Q9N2D3; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Gastricsin precursor (EC 3.4.23.3) (Pepsinogen C).

GN PGC.

OS Callithrix jacchus (Common marmoset).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Platyrrhini; Callitrichidae;

OC Callithrix.

OX NCBI_TaxID=9483; 

RN [1] 

RP SEQUENCE FROM N.A., SEQUENCE OF 17-26, FUNCTION, AND ENZYME

RP REGULATION.

RC TISSUE=Gastric mucosa;

RX MEDLINE=20250834; PubMed=10788784;

RA Kageyama T.;

RT "New World monkey pepsinogens A and C, and prochymosins. Purification,

RT characterization of enzymatic properties, cDNA cloning, and molecular 

RT evolution."; 

RL J. Biochem. 127:761-770(2000). 

CC -!- FUNCTION: Hydrolyzes a variety of proteins. 

CC -!- CATALYTIC ACTIVITY: More restricted specificity than pepsin A, but 

CC shows preferential cleavage at Tyr-|-Xaa bonds; high activity

CC towards hemoglobin as substrate. 

CC -!- ENZYME REGULATION: Inhibited by pepstatin. 

CC -!- MISCELLANEOUS: The optimal pH is around 2. 

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB038385; BAA90872.1; -. 

DR PIR; JC7246; JC7246. 



DR HSSP; P20142; 1AVF. 

DR MEROPS; A01.003; -. 

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.



KW Hydrolase; Aspartyl protease; Zymogen; Digestion; Signal.

FT SIGNAL 1 16

FT PROPEP 17 59 ACTIVATION PEPTIDE (BY SIMILARITY)

FT CHAIN 60 388 GASTRICSIN.

FT ACT_SITE 91 91 BY SIMILARITY.

FT ACT_SITE 276 276 BY SIMILARITY.

FT DISULFID 104 109 BY SIMILARITY.

FT DISULFID 267 271 BY SIMILARITY.

FT DISULFID 310 343 BY SIMILARITY.

SQ SEQUENCE 388 AA; 42503 MW; 0BC48DBD1F7D2D8C CRC64;



Query Match 13.1%; Score 351.5; DB 1; Length 388; 

Best Local Similarity 30.1%; Pred. No. 3.5e-18; 

Matches 112; Conservative 56; Mismatches 115; Indels 89; Gaps 17; 



Qy 

Db 

Qy 

Db 



92 YYLEMLI GT P PQKLQI LVDTGS SN FAV AGTPHS YI DTYFDTERSSTYRSKGF 143 

I : I : I I I I I I : I I I I I I I I I I I I I : II I I I I 

73 YFGEISIGTPPQNFLVLFDTGSSNLWVPSVYCQSQACTSHS RFNPSASSTYSSNGQ 128 

144 DVTVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIATIFESENFFLPG IKWNGILG 198 

: : : I I I I I I I I : I : I I III II : : : | | : | 

129 TFSLQYGSGSLTGFFGYDTLTV QSIQVPNQEFGLSEN— EPGTNFVYAQFDGIMG 181 



Qy 

Db 

Qy 

Db 



199 LAYATIAKPSSSLETFFDSLVTQTVNIPN-VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPS 257 

III I : : : I : : : : : I I I : I : : I I : : : I I : : I 

182 LAYPALSMGGAT — TAMQGMLQEGALTSPVFSFYLSNQ QGS SGGAVI FGGVDS S 233 

258 LYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFD 317 

I I I I : : I : : I I : I I I : I I I I : II : I I I I : II : I I : I I : 

234 LYTGQIYWAPVTQELYWQIGIEEFLIGGQASGW-CSE — GCQAIVDTGTSLLTVPQQYMS 290 



Qy 

Db 



318 AWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYF PKISIYLRDENS 367 

I : I I I I : I I : I I : : : 
291 AFLEA TGAQ EDEYGQFLVNCDSIQNLPTLTFII 323 



Qy 

Db 

Qy 

Db 



368 SRSFRITILPQLYIQPMMGAGLNYECY-RFGISP STNALVI GATVMEGFYVI F 419 



: I II 

324 -NGVEFPLPPSSYI 



420 DRAQKRVGFAAS 431 

I Mill: 
376 D L GNN RVG FATA 38 7 



I : I I : I I : : I : : I : I 

LSNNGYCTVGVEPTYLSSQNSQPLWILGDVFLRSYYSVF 375 



RESULT 8 
PEPC_CAVPO 

ID PEPC_CAVPO STANDARD; PRT; 394 AA. 

AC Q64411; 

DT 15-JUL-1998 (Rel. 36, Created) 



DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Gastricsin precursor (EC 3.4.23.3) (Pepsinogen C).

GN PGC.

OS Cavia porcellus (Guinea pig).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Hystricognathi; Caviidae; Cavia.

OX NCBI_TaxID=10141; 

RN [1] 

RP SEQUENCE FROM N.A.

RX MEDLINE=92355614; PubMed=1644829;

RA Kageyama T., Ichinose M., Tsukada S., Miki K., Kurokawa K., Koiwai O.,

RA Tanji M., Yakabe E., Athauda S.B., Takahashi K.;

RT "Gastric procathepsin E and progastricsin from guinea pig.

RT Purification, molecular cloning of cDNAs, and characterization of 

RT enzymatic properties, with special reference to procathepsin E."; 

RL J. Biol. Chem. 267:16450-16459(1992). 

CC -!- CATALYTIC ACTIVITY: More restricted specificity than pepsin A, but 
CC shows preferential cleavage at Tyr-|-Xaa bonds; high activity

CC towards hemoglobin as substrate.

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M88652; AAA37053.1; 

DR PIR; B43356; B43356. 

DR HSSP; P20142; 1AVF. 

DR MEROPS; A01.003; -. 

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hydrolase; Aspartyl protease; Zymogen; Digestion; Signal. 

FT SIGNAL 1 16 POTENTIAL. 

FT PROPEP 17 65 ACTIVATION PEPTIDE. 

FT CHAIN 66 394 GASTRICSIN. 

FT ACT_SITE 97 97 BY SIMILARITY. 

FT ACT_SITE 283 283 BY SIMILARITY.

FT DISULFID 110 115 BY SIMILARITY. 

FT DISULFID 273 277 BY SIMILARITY. 

FT DISULFID 316 349 BY SIMILARITY. 

SQ SEQUENCE 394 AA; 42995 MW; 114F08E105D49865 CRC64;

Query Match 12.1%; Score 324.5; DB 1; Length 394; 

Best Local Similarity 29.0%; Pred. No. 3.2e-16; 

Matches 107; Conservative 63; Mismatches 116; Indels 83; Gaps 18; 

Qy 92 YYLEMLI GTPPQKLQI LVDTGS SNF AVAGTPHSYIDTYFDTERSSTYRSKGF 143 

I : : : : I I I I I I : I I I I I I I : : I I I I I : MM: 



Db 



79 YFGQISLGTPPQSFQVLFDTGSSNLWVPSVYCSSLACTTH TRFNPRDSSTYVATDQ 134 



Qy 144 DVTVKYTQGSWTGFVGEDLVTI PK-GFNTSFLVNIATIFESENFFLPG IK 192 

: : : I I I I I I I : I I I I I I I : I I I : 

Db 135 SFSLEYGTGSLTGVFGYDTMTIQDIQVPKQEFGLS ETE PGSDFVYAE 181 

Qy 193 WNGI LGLAYATLAKP S S S LET FFDS LVTQAN I - PNVFSMQMCGAGLPVAGS — GTNGGS L 249 

:: I I I I I I I : : : : I I : : : : : I I : : II I : : I I 

Db 182 FDGILGLGYPGLSEGGAT— TAMQGLLREGALSQSLFSVYL GSQQGSDEGQL 231 

Qy 250 VLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLL 309 

: | | | : : I I I I I I : : I I : : I I : I I I I I : I : I I I : I I : I I 

Db 232 ILGGVDESLYTGDIYWTPVTQELYWQIGIEGFLIDGSASGWCSR GCQGIVDTGTSLL 288 

Qy 310 RLPQKVFDAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSR 369 

: | : I : I : I : : I : : | : : I I : 

Db 289 TVPSDYLSTLVQAIGAEE — NEYGEYF VSCSSIQDLPTLTFVISGV 332 

Qy 370 SFRITILPQLYIQP MMGAGLNYECYRFGISPSTN — ALVI GATVMEGFYVI FDRA 422 

: I || I : I I : I I : : I : : I : : I I 

Db 333 — EFPLSPSAYILSGENYCMVGLESTY VS PGGGEPVWI LGDVFLRS YYSVYDLA 384 

Qy 423 QKRVGFAAS 431 

I I I I I : 

Db 385 NNRVGFATA 393 



RESULT 9 




RENI_MOUSE

ID RENI_MOUSE STANDARD; PRT; 402 AA.

AC P06281; P97911; Q62153; Q62154;

DT 01-JAN-1988 (Rel. 06, Created)

DT 01-JAN-1988 (Rel. 06, Last sequence update)

DT 15-MAR-2004 (Rel. 43, Last annotation update)

DE Renin 1 precursor (EC 3.4.23.15) (Angiotensinogenase) (Kidney renin).

GN REN1 OR REN-1 OR REN.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BALB/c;

RX MEDLINE=84182525; PubMed=6370686;

RA Holm I., Ollo R., Panthier J.-J., Rougeon F.;

RT "Evolution of aspartyl proteases by gene duplication: the mouse renin

RT gene is organized in two homologous clusters of four exons.";

RL EMBO J. 3:557-562(1984).

RN [2]

RP SEQUENCE FROM N.A.

RC STRAIN=BALB/c; TISSUE=Kidney;

RX MEDLINE=90067953; PubMed=2685761;

RA Kim W.S., Murakami K., Nakayama K.;

RT "Nucleotide sequence of a cDNA coding for mouse Ren1 preprorenin.";

RL Nucleic Acids Res. 17:9480-9480(1989).

RN [3]

RP SEQUENCE FROM N.A.





RC STRAIN=DBA/2, and C57BL/10; 

RX MEDLINE=90108722; PubMed=2691339;

RA Burt D.W., Mullins L.J., George H., Smith G., Brooks J., Pioli D.,

RA Brammar W.J.;

RT "The nucleotide sequence of a mouse renin-encoding gene, Ren-1d, and

RT its upstream region.";

RL Gene 84:91-104(1989).

RN [4] 

RP SEQUENCE OF 1-30 FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=84298161; PubMed=6089205;

RA Panthier J.-J., Dreyfus M., Roux D.T.L., Rougeon F.;

RT "Mouse kidney and submaxillary gland renin genes differ in their 5' 

RT putative regulatory sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 81:5489-5493(1984). 

RN [5] 

RP SEQUENCE OF 1-31 FROM N.A. 

RX MEDLINE=85085936; PubMed=6392850;

RA Field L.J., Philbrick W.M., Howles P.N., Dickinson D.P.,

RA McGowan R.A., Gross K.W.;

RT "Expression of tissue-specific Ren-1 and Ren-2 genes of mice:

RT comparative analysis of 5'-proximal flanking regions.";

RL Mol. Cell. Biol. 4:2321-2331(1984). 

RN [6] 

RP SEQUENCE OF 22-37 AND 72-80. 

RC STRAIN=C57BL/10ROS X C3H/HEROS; TISSUE=Kidney;

RX MEDLINE=97182599; PubMed=9030738;

RA Jones C.A., Petrovic N., Novak E.K., Swank R.T., Sigmund C.D.,

RA Gross K.W.;

RT "Biosynthesis of renin in mouse kidney tumor As4.1 cells.";

RL Eur. J. Biochem. 243:181-190(1997). 

CC -!- FUNCTION: Renin is a highly specific endopeptidase, whose only 
CC known function is to generate angiotensin I from angiotensinogen 

CC in the plasma, initiating a cascade of reactions that produce an 

CC elevation of blood pressure and increased sodium retention by the 

CC kidney. 

CC -!- CATALYTIC ACTIVITY: Cleaves Leu-|- bond in angiotensinogen to

CC generate angiotensin I.

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- TISSUE SPECIFICITY: Kidney. 

CC -!- INDUCTION: Renal renin is synthesized by the juxtaglomerular cells 
CC of the kidney in response to decreased blood pressure and sodium 

CC concentration. 

CC -!- POLYMORPHISM: In inbred mouse strains, there are at least two 
CC alleles which can occur at the Ren1 locus: Ren-1D and Ren-1C.

CC The sequence shown is that of Ren-1C.

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X00810; CAA25391.1; -. 



DR EMBL; X00811; CAA25391.1; JOINED.

DR EMBL; X00812; CAA25391.1; JOINED.

DR EMBL; X00813; CAA25391.1; JOINED.

DR EMBL; X00814; CAA25391.1; JOINED.

DR EMBL; X00815; CAA25391.1; JOINED.

DR EMBL; X00816; CAA25391.1; JOINED.

DR EMBL; X00850; CAA25391.1; JOINED.

DR EMBL; X00851; CAA25391.1; JOINED.

DR EMBL; X16642; CAA34636.1; -.

DR EMBL; K02596; AAA40045.1;

DR EMBL; M32352; AAA40043.1;

DR EMBL; K02800; AAA40044.1;

DR EMBL; M34190; AAA40042.1;

DR PIR; A00989; REMSK.

DR HSSP; P00796; 1SMR.







DR MEROPS; A01.007; 

DR MGD; MGI:97898; Ren1.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 

DR PROSITE; PS00141; ASP_PROTEASE; 2. 

KW Hydrolase; Aspartyl protease; Plasma; Glycoprotein; Zymogen; 



KW Signal.

FT SIGNAL 1 21

FT PROPEP 22 71 ACTIVATION PEPTIDE.

FT CHAIN 72 402 RENIN 1.

FT ACT_SITE 102 102 BY SIMILARITY.

FT ACT_SITE 287 287 BY SIMILARITY.

FT DISULFID 115 122 BY SIMILARITY.

FT DISULFID 278 282 BY SIMILARITY.

FT CARBOHYD 69 69 N-LINKED (GLCNAC...) (POTENTIAL)

FT CARBOHYD 139 139 N-LINKED (GLCNAC...) (POTENTIAL)

FT CARBOHYD 320 320 N-LINKED (GLCNAC...) (POTENTIAL)

FT VARIANT 58 58 W -> R (in Ren-1D).

FT VARIANT 68 68 T -> I (in Ren-1D).

FT VARIANT 160 160 S -> V (in Ren-1D).

FT VARIANT 315 315 E -> D (in Ren-1D).

FT VARIANT 352 352 N -> Y (in Ren-1D).

FT CONFLICT 6 23 MISSING (IN REF. 1).

FT CONFLICT 24 24 T -> I (IN REF. 1).

FT CONFLICT 163 163 V -> VSRV (IN REF. 1).

SQ SEQUENCE 402 AA; 44342 MW; D42920B555E97A38 CRC64;



Query Match 11.9%; Score 320; DB 1; Length 402; 

Best Local Similarity 28.6%; Pred. No. 7e-16; 

Matches 126; Conservative 66; Mismatches 181; Indels 68; Gaps 21; 

Qy 10 LPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPG-PGTPAERHADG 65

: I I I I I : I I : I I I I : I III hi 

Db 6 MPLWALLLL WSPCTFSLPTRTATFERIPLKKMPSVREILEERGVDMTRLSAEWGV 60 

Qy 66 PA LAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAV 118 

I : III I : I II 111:111111 : : : I I M : I I 

Db 61 FT KRPS LTN LT S PWLTN YL NTQ YYGEIGIGTPPQTFKVIFDTGSANLWV 110 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



119 AGTPHSY IDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVT1PKGFNTS 172 



I I 



: : : : I I : I 



I I I : I I 



I I : : I I I : I 



111 PSTKCSRLYLACGIHSLYESSDSSSYMENGSDFTIHYGSGRVKGFLSQDSVTV-GGITVT 169 
173 FLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PNVFSMQ 231 



I I 



I : : I : I I 



II : : : I 



III: 



170 QTFGEVTELPLIPFML — AKFDGVLGMGFP — AQAVGGVT P VFDH I L SQGVLKEEVFS VY 225 
232 MCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLD 291



I I 



I I : II I I : I I : I : I I : 



: I I : : : I I I 



226 Y NRGSHLLGGEWLGGSDPQHYQGNFHYVSISKTDSWQITMKGVSVG — SSTLL 277 

292 CREYNADKAIVDSGTTLLRLPQKVFDAWEAV-ARASLIPEFSDGFWTGSQLACWTNSET 350 



I I I 



: I I : I : : : I 



: I 



I : 



-WNC SQV 324 



278 CEEGCA — VWDTGSSFISAPTSSLKLIMQALGAKEKRIEEY- 



351 PWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGL-NYECYRFGISPSTNAL-VIG 408 

| I I I I | : : : : : I I : III : hi 

325 P — TLPDISFDL GGRAYTLSSTDYVLQYPNRRDKLCTLALHAMDIPPPTGPVWVLG 378 

409 ATVMEGFYVIFDRAQKRVGFA 429

II: II III hill 
379 ATFIRKFYTEFDRHNNRIGFA 399 



RESULT 10 
CATD_CLUHA



ID CATD_CLUHA STANDARD; PRT; 396 AA.

AC Q9DEX3;

DT 28-FEB-2003 (Rel. 41, Created)

DT 28-FEB-2003 (Rel. 41, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Cathepsin D precursor (EC 3.4.23.5).

OS Clupea harengus (Atlantic herring).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Clupeomorpha; Clupeidae;

OC Clupea.

OX NCBI_TaxID=7950;

RN [1]

RP SEQUENCE FROM N.A.

RA Nielsen L.B., Stougaard P., Andersen P.S., Pedersen L.H.;

RT "Cloning and sequence determination of herring muscle cathepsin D.";

RL Submitted (OCT-2000) to the EMBL/GenBank/DDBJ databases.

RN [2]

RP SEQUENCE OF 62-82.

RC TISSUE=Skeletal muscle;

RX MEDLINE=21165469; PubMed=11207447;

RA Nielsen L.B., Nielsen H.H.;

RT "Purification and characterization of cathepsin D from herring muscle

RT (Clupea harengus).";

RL Comp. Biochem. Physiol. 128B:351-363(2001).

CC -!- FUNCTION: Cathepsin D is an acid protease active in intracellular

CC protein breakdown.

CC -!- CATALYTIC ACTIVITY: Specificity similar to, but narrower than,

CC that of pepsin A. Does not cleave the 4-Gln-|-His-5 bond in B

CC chain of insulin.

CC -!- ENZYME REGULATION: Inhibited by pepstatin.

CC -!- SUBUNIT: Monomer.

CC -!- SUBCELLULAR LOCATION: Lysosomal.

CC -!- MISCELLANEOUS: The isoelectric point is 6.8. Has optimal activity

CC at pH 2.5 with hemoglobin as the substrate and the optimal

CC temperature is 37 degrees Celsius.

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF312364; AAG27733.1;

DR HSSP; P07339; 1LYB.

DR MEROPS; A01.009; -.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Glycoprotein; Lysosome; Signal; Zymogen.



FT SIGNAL 1 18 POTENTIAL.

FT PROPEP 19 61 ACTIVATION PEPTIDE.

FT CHAIN 62 396 CATHEPSIN D.

FT ACT_SITE 94 94 BY SIMILARITY.

FT ACT_SITE 281 281 BY SIMILARITY.

FT DISULFID 107 114 BY SIMILARITY.

FT DISULFID 272 276 BY SIMILARITY.

FT DISULFID 315 352 BY SIMILARITY.

FT CARBOHYD 131 131 N-LINKED (GLCNAC...) (POTENTIAL)

SQ SEQUENCE 396 AA; 43315 MW; D0375DC38567A31B CRC64;


Query Match 11.7%; Score 315.5; DB 1; Length 396;

Best Local Similarity 27.1%; Pred. No. 1.5e-15;

Matches 112; Conservative 65; Mismatches 141; Indels 95; Gaps 18;



QY 


50 


Db 


47 


QY 


110 


Db 


94 


QY 


164 


Db 


154 


QY 


219 


Db 


205 



50 GTPAERHADGLALALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILV 109 

| | : : | I : I : I I : : I I I :: I I I I : : 

47 GTNSLQHNQGFPSSNAP TPETLKNYM DAQ YYGE I GLGT PVQMFT WF 93 

)TGSSNFAVAGTPHSYIDT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLV 163 

I I I I I : | : | : : : I I I I I : : : I I I : I : : : I 

94 DTGSSNLWLPSIHCSFTDIACLLHHKYNGAKSSTYVKNGTEFAIQYGSGSLSGYLSQDSC 153 



: I 



: I I 



I : : I I I I : I I 



I I : 



219 VTQANI-PNVFSMQMCGAGLPVAGSGTN* 



GGSLVLGGIEPSLYKGDIWYTPIKEEW 272 
I I I : I I I : I III II: - 



273 YYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLI 329 



I : I I : : | | | | I I : : =1111-11 = 1= I I : : I : II 

Db 255 YWQIHMDGMSIGSQ-LTL-CKD--GCEAIVDTGTSLITGPPAEVRALQKAIGAIPLIQGE 310 

Qy 330 PEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITILPQ 378 

I I I I : III I I : : : : : : I 

Db 311 YMI DCKKVPTLPT IS-- FNVGGK TYSLTGEQY VLKESQGGKTICLSGLMG 358 

Qy 379 LYIQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAAS 431 

Ml I : : : I : : I : I I I I I I I I I 

Db 359 LEIPP PAGPLWI LGDVFI GQYYTVFDRESNRVGFAKS 395 



RESULT 11 
APR1_ORYSA

ID APR1_ORYSA STANDARD; PRT; 509 AA.

AC Q42456; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Aspartic proteinase oryzasin 1 precursor (EC 3.4.23.-). 

OS Oryza sativa (Rice).

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 

OC Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; 

OC Ehrhartoideae; Oryzeae; Oryza. 

OX NCBI_TaxID=4530; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Nipponbare / Japonica; TISSUE=Seed; 

RX MEDLINE=96048031; PubMed=7556174;

RA Asakura T., Watanabe H., Abe K., Arai S.;

RT "Rice aspartic proteinase, oryzasin, expressed during seed ripening 

RT and germination, has a gene organization distinct from those of 

RT animal and microbial aspartic proteinases."; 

RL Eur. J. Biochem. 232:77-83(1995). 

CC -!- DEVELOPMENTAL STAGE: Seed ripening and germination. 

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC ' — 

DR EMBL; D32165; BAA06876.1; 

DR EMBL; D32144; BAA06875.1; 

DR PIR; S66516; S66516. 

DR HSSP; P42210; 1QDM. 

DR MEROPS; A01.020; -.

DR Gramene; Q42456; -. 



DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR InterPro; IPR007856; SapB_1.

DR InterPro; IPR008138; SapB_2.

DR InterPro; IPR008140; SapB_sub.

DR InterPro; IPR008373; Saposin.

DR InterPro; IPR008139; SaposinB.

DR Pfam; PF00026; asp; 1.

DR Pfam; PF05184; SapB_1; 1.

DR Pfam; PF03489; SapB_2; 1.

DR PRINTS; PR00792; PEPSIN.

DR PRINTS; PR01797; SAPOSIN.

DR ProDom; PD001732; SapB_sub; 1.

DR SMART; SM00118; SAPB; 2.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Zymogen; Glycoprotein; Signal.

FT SIGNAL 1 20 POTENTIAL.

FT PROPEP 21 67 POTENTIAL.

FT CHAIN 68 509 ASPARTIC PROTEINASE ORYZASIN 1.

FT DOMAIN 318 416 SPECIFIC TO PLANT ASPARTIC PROTEINASES

FT (BY SIMILARITY).

FT ACT_SITE 103 103 BY SIMILARITY.

FT ACT_SITE 290 290 BY SIMILARITY.

FT DISULFID 116 122 BY SIMILARITY.

FT DISULFID 281 285 BY SIMILARITY.

FT DISULFID 428 465 BY SIMILARITY.

FT CARBOHYD 252 252 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 400 400 N-LINKED (GLCNAC...) (POTENTIAL).

SQ SEQUENCE 509 AA; 54145 MW; 182F5DADA4CBE358 CRC64;


Query Match 11.7%; Score 313.5; DB 1; Length 509;

Best Local Similarity 23.0%; Pred. No. 2.9e-15;

Matches 127; Conservative 75; Mismatches 179; Indels 171; Gaps 19;



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



ALARALLLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRWAPTPGPGTPAERHADGLAL

: : I I I : I I I I I : I II : I I I I I I I I 

SVALVLIAAVLLQALLPASAEEGLVRIALKKRPIDENSRVAARLSG 



62 



EEGARRLGL 59 



115 



63 ALEPALAS PAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSN 

: | | : : | : : : I : I : : I I I I I I :: I I I I I I 

60 RGANSLGGGGGEGDIVALKNYMNAQ YFGEIGVGTPPQKFTVIFDTGSSNLWVPSAK 115 

116 — FAVAGTPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGFNTSF 173 

| : : I || : : : I I I I : I : : I II II MM: 
116 CYFSIACFFHS RYKSGQSSTYQKNGKPAAIQYGTGSIAGFFSEDSVTVGD 165 

174 LVNIATIFESENFF LPGI KWNGILGLAYATLAKPSSSLETFFDSLVTQANI 224 

: : : I I I : I : : I I I I I : : : : : 

166 LVVKDQEFIEATKEPGLTFMVAKFDGILGLGFQEISVGDA V 206 

225 PNVFSMQMCG-AGLPVAGSGTN GGSLVLGGIEPSLYKGDIWYTPIKEEWYYQI 276 

| : | | || I I I : I I I : : I I I I I : I I : : : I : I 

207 PWYKMVEQGLVSEPVFSFWFNRHSDEGEGGEIVFGGMDPSHYKGNHTYVPVSQKGYWQF 266 



Qy 

Db 

Qy 

Db 



277 EILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAWEAVARASLIPE 331 

I : : I I I : : I : I I I I I I : I I I - > I : : : : 

267 EMGDVLIGGKTTGF-CA — SGCSAIADSGTSLLAGPTAI ITEINEKIGATGWSQECKTV 323 



332 



FSDGF- 



336 



324 VSQYGQQILDLLLAETQPSKICSQVGLCTFDGKHGVSAGIKSWDDEAGESNGLQSGPMC 383 



Qy 



337 WTGSQLACWTNSETPWSY FPKISIYLRD 364 

I : I I I : : I I : I I : 

Db 384 NACEMAVVWMQNQLAQNKTQDLILNYINQLCDKLPSPMGESSVDCGSLASMPEISFTIGA 443 



Qy 365 ENSSRSFRITILPQLYIQPMMGAGLNYECY — - — RFGISPSTNAL-VIGATVMEGFYVIF 419 

: : : I : I I : I I : I II I - I I : : : I 

D b 444 K KFALKPEEYIL-KVGEGAAAQCISGFTAMDIPPPRGPLWILGDVFMGAYHTVF 4 96 

Qy 420 DRAQKRVGFAAS 431 

I : I I I I I I 

Db 497 DYGKMRVGFAKS 508



RESULT 12 
PEPC_RAT 

ID PEPC_RAT STANDARD; PRT; 392 AA. 

AC P04073; 

DT 01-NOV-1986 (Rel. 03, Created) 

DT 01-NOV-1986 (Rel. 03, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Gastricsin precursor (EC 3.4.23.3) (Pepsinogen C).

GN PGC.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Wistar; 

RX MEDLINE=89255508; PubMed=2722863;

RA Ishihara T., Ichihara Y., Hayano T., Katsura I., Sogawa K.,

RA Fujii-Kuriyama Y., Takahashi K.;

RT "Primary structure and transcriptional regulation of rat pepsinogen C

RT gene.";

RL J. Biol. Chem. 264:10193-10199(1989). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Wistar; 

RX MEDLINE=87054020; PubMed=3780741;

RA Ichihara Y., Sogawa K., Morohashi K., Fujii-Kuriyama Y., Takahashi K.;

RT "Nucleotide sequence of a nearly full-length cDNA coding for 

RT pepsinogen of rat gastric mucosa."; 

RL Eur. J. Biochem. 161:7-12(1986). 

RN [3] 

RP SEQUENCE OF 16-112. 

RC STRAIN=Wistar; 

RX MEDLINE=84257697; PubMed=6743670;

RA Arai K.M., Muto N., Tani S., Akahane K.;

RT "The N-terminal sequence of rat pepsinogen."; 

RL Biochim. Biophys. Acta 788:256-261(1984). 

CC -!- CATALYTIC ACTIVITY: More restricted specificity than pepsin A, but 

CC shows preferential cleavage at Tyr-|-Xaa bonds; high activity

CC towards hemoglobin as substrate. 

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M25993; AAA41827.1; -.

DR EMBL; M25985; AAA41827.1; JOINED.

DR EMBL; M25986; AAA41827.1; JOINED.

DR EMBL; M25987; AAA41827.1; JOINED.

DR EMBL; M25988; AAA41827.1; JOINED.

DR EMBL; M25989; AAA41827.1; JOINED.

DR EMBL; M25990; AAA41827.1; JOINED.

DR EMBL; M25991; AAA41827.1; JOINED.

DR EMBL; M25992; AAA41827.1; JOINED.

DR EMBL; X04644; CAA28305.1; -.

DR PIR; A33510; A24608.

DR HSSP; P20142; 1AVF.

DR MEROPS; A01.003; -.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.



KW Hydrolase; Aspartyl protease; Zymogen; Digestion

FT SIGNAL 1 16

FT PROPEP 17 62 ACTIVATION PEPTIDE.

FT CHAIN 63 392 GASTRICSIN.

FT ACT_SITE 94 94

FT ACT_SITE 280 280

FT DISULFID 107 112 BY SIMILARITY.

FT DISULFID 270 275 BY SIMILARITY.

FT DISULFID 314 347 BY SIMILARITY.

FT CONFLICT 31 31 E -> Q (IN REF. 3).

FT CONFLICT 103 103 S -> A (IN REF. 3).

FT CONFLICT 109 109 S -> L (IN REF. 3).

SQ SEQUENCE 392 AA; 42833 MW; 092A5EAF2783EDD1




Query Match 11.6%; Score 313; DB 1;

Best Local Similarity 29.5%; Pred. No. 2.2e-15;

Matches 105; Conservative 56; Mismatches 139; Indels 56; Gaps 16;



QY 


92 


Db 


76 


Qy 


144 


Db 


132 


Qy 


199 


Db 


185 


Qy 


257 



I : I : I I I I I I : I I I I I I I I 



III: 



I : : I I I I 



I I I I I I I I : I : 



I I 



I I I 



I I 



I I 



I : I I I : I II:: 



257 SLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVF 316 
: I | I : I : I : : I I : II I II I : I : I I I : I I : I I : I : 



Db 



236 NLYTGEITWVPVTQELYWQITIDDFLIGDQASGW-CSSQGC-QGIVDTGTSLLVMPAQYL 293 



Qy 317 DAWEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 376 

: : : : I : : I : : | : | I : I I : : 

D b 294 SELLQTIGAQE--GEYGEYF VSCDSVSS LPTLSFVL NGVQFPLS 335 

Qy 377 PQLY-IQPMMGAGLNYECYRFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFAAS 431 

| | | | : | : : I : : I I I I : I I I I 

Db 336 PSSYIIQEDNFCMVGLESISLTSESGQPLWILGDVFLRSYYAIFDMGNNKVGLATS 391 



RESULT 13 
PEPE_CHICK 

ID PEPE_CHICK STANDARD; PRT; 383 AA. 

AC P16476; 

DT 01-AUG-1990 (Rel. 15, Created)

DT 01-AUG-1990 (Rel. 15, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE Embryonic pepsinogen precursor (EC 3.4.23.-).

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 

OC Gallus. 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=88227903; PubMed=3131317;

RA Hayashi K., Agata K., Mochii M., Yasugi S., Eguchi G., Mizuno T.;

RT "Molecular cloning and the nucleotide sequence of cDNA for embryonic

RT chicken pepsinogen: phylogenetic relationship with prochymosin.";

RL J. Biochem. 103:290-296(1988). 

CC -!- DEVELOPMENTAL STAGE: Specifically secreted during the embryonic
CC period in the chicken proventriculus (glandular stomach).

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib . ch) . 

CC 

DR EMBL; D00215; BAA00153.1; -. 

DR PIR; A41443; A41443. 

DR HSSP; P00794; 4CMS.

DR MEROPS; A01.028; -.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1.

DR PRINTS; PR00792; PEPSIN.

DR PROSITE; PS00141; ASP_PROTEASE; 2.

KW Hydrolase; Aspartyl protease; Digestion; Signal; Glycoprotein. 

FT SIGNAL 1 16 POTENTIAL. 

FT CHAIN 17 383 EMBRYONIC PEPSINOGEN. 

FT ACT_SITE 94 94 BY SIMILARITY.

FT ACT_SITE 276 276 BY SIMILARITY.

FT DISULFID 107 112 BY SIMILARITY.

FT DISULFID 267 271 BY SIMILARITY.

FT DISULFID 310 344 BY SIMILARITY.

FT CARBOHYD 132 132 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 204 204 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 309 309 N-LINKED (GLCNAC...) (POTENTIAL).

FT CARBOHYD 350 350 N-LINKED (GLCNAC...) (POTENTIAL).

FT VARIANT 51 51 T -> S. 

SQ SEQUENCE 383 AA; 41719 MW; 1642796871611F54 CRC64;

Query Match 11.5%; Score 310; DB 1; Length 383; 

Best Local Similarity 26.8%; Pred. No. 3.5e-15; 

Matches 106; Conservative 63; Mismatches 136; Indels 90; Gaps 15; 

Qy 56 HA--DGLALALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGS 113

||||:||| III 11:111111 :: I I I I 

Db 55 HAFPDVLTWTEPLL NTLDM EYYGTISIGTPPQDFTWFDTGS 97 

Qy 114 SNFAVAG TPHSYIDTYFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIPKGF 169

| | : I I : : | | | | : I I : : : : I I I I I I M : 

Db 98 SNLWVPSVSCTSPACQSHQMFNPSQSSTYKSTGQNLSIHYGTGDMEGTVGCDTVTVASLM 157 

Qy 170 NTSFLVNIATIFESENFFLPGIKWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PNVF 228

: | : | : : | I I I : : I : : I I I I I I : I I : : | I : : I : : : hi 
Db 158 DTNQLFGLST-SEPGQFFV-YVKFDGILGLGYPSLA--ADGITPVFDNMVNESLLEQNLF 213

Qy 229 SMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSL 288 

| : : : I : I I I I : I : I I : I = : I : I I : = = I : 

Db 214 SVYLSREPM GSMWFGGI DES YFTGS INWI PVS YQGYWQI SMDS I IVNKQEI 265 

Qy 289 NLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWTNS 348

: : I I : I : I I : I : I : : I I I 
Db 266 ACSSGCQAIIDTGTSLVAGPASDINDIQSAVGANQ 300

Qy 349 ETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNY ECY 394 

|| I I : I : : : I I : I I 

Db 301 NTYGEY SVNCSHILAMPDWF--VIG-GIQYPVPALAYTEQNGQGTCM 345

Qy 395 RFGISPSTNALVIGATVMEGFYVIFDRAQKRVGFA 429

: I : : : | : : I I I I I I I II I 
Db 346 S S FQNS S7VDLWI LGDVFI RVYYS I FDRANNRVGLA 380 
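
The summary figures printed with each hit can be cross-checked with simple arithmetic. The sketch below is based only on the numbers visible in this report, not on any documented definition from the search program: it assumes that "Best Local Similarity" is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels), and that "Query Match" is the hit score divided by the query's self-score (2687, the score reported for RESULT 1). The function names are hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


def query_match(score, self_score=2687.0):
    """Hit score as a percentage of the query self-score (assumed definition)."""
    return round(100.0 * score / self_score, 1)


# (Matches, Conservative, Mismatches, Indels, Score) copied from the
# summary lines of RESULT 13, 14 and 15 in this report.
hits = {
    "PEPE_CHICK": (106, 63, 136, 90, 310.0),
    "CATD_HUMAN": (121, 75, 180, 71, 308.5),
    "CATD_MOUSE": (103, 64, 123, 85, 306.5),
}

for name, (m, c, x, i, score) in hits.items():
    print(name, best_local_similarity(m, c, x, i), query_match(score))
```

Under these assumptions the computed values agree with the percentages printed for the three hits, which is a useful sanity check when individual digits in the summary lines are in doubt.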



RESULT 14 
CATD_HUMAN

ID CATD_HUMAN STANDARD; PRT; 412 AA. 

AC P07339; 

DT 01-APR-1988 (Rel. 07, Created) 

DT 01-APR-1988 (Rel. 07, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Cathepsin D precursor (EC 3.4.23.5). 

GN CTSD. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;



RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=85270436; PubMed=3927292;

RA Faust P.L., Kornfeld S., Chirgwin J.M.;

RT "Cloning and sequence analysis of cDNA for human cathepsin D.";

RL Proc. Natl. Acad. Sci. U.S.A. 82:4910-4914(1985). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=87231068; PubMed=3588310;

RA Westley B.R., May F.E.B.; 

RT "Oestrogen regulates cathepsin D mRNA levels in oestrogen responsive 

RT human breast cancer cells."; 

RL Nucleic Acids Res. 15:3773-3786(1987). 

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=91299158; PubMed=2069717;

RA Redecker B., Heckendorf B., Grosch H.W., Mersmann G., Hasilik A.; 

RT "Molecular organization of the human cathepsin D gene."; 

RL DNA Cell Biol. 10:423-431(1991).

RN [4] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 
RN [5] 

RP SEQUENCE OF 1-22 FROM N.A. 

RX MEDLINE=94085791; PubMed=8262386;

RA May F.E., Smith D.J., Westley B.R.; 

RT "The human cathepsin D-encoding gene is transcribed from an estrogen- 
RT regulated and a constitutive start point."; 

RL Gene 134:277-282(1993). 
RN [6] 

RP SEQUENCE OF 1-22 FROM N.A. 

RX MEDLINE=95021301; PubMed=7935485;

RA Augereau P., Miralles F., Cavailles V., Gaudelet C., Parker M.,
RA Rochefort H.;

RT "Characterization of the proximal estrogen-responsive element of 
RT human cathepsin D gene."; 



RL Mol. Endocrinol. 8:693-703(1994). 

RN [7] 

RP SEQUENCE OF 170-180. 

RC TISSUE=Liver; 

RA Hochstrasser D.F., Frutiger S., Paquet N., Bairoch A., Ravier F.,

RA Pasquali C., Sanchez J.-C., Tissot J.-D., Bjellqvist B., Vargas R.,

RA Appel R.D., Hughes G.J.; 

RL Submitted (JUN-1992) to Swiss-Prot. 

RN [8] 

RP CARBOHYDRATE-LINKAGE SITE ASN-263.

RX MEDLINE=22660472; PubMed=12754519;

RA Zhang H., Li X.-J., Martin D.B., Aebersold R.;

RT "Identification and quantification of N-linked glycoproteins using 

RT hydrazide chemistry, stable isotope labeling and mass spectrometry."; 

RL Nat. Biotechnol. 21:660-666(2003).

RN [9] 

RP VARIANT VAL-58. 

RX MEDLINE=20179010; PubMed=10716266;

RA Papassotiropoulos A., Bagli M., Kurz A., Kornhuber J., Forstl H.,

RA Maier W., Pauls J., Lautenschlager N., Heun R.;

RT "A genetic variation of cathepsin D is a major risk factor for 

RT Alzheimer's disease."; 

RL Ann. Neurol. 47:399-403(2000). 

RN [10] 

RP X-RAY CRYSTALLOGRAPHY (3 ANGSTROMS).

RC TISSUE=Spleen; 

RX MEDLINE=93223670; PubMed=8467789; 

RA Metcalf P., Fusek M.;

RT "Two crystal structures for cathepsin D: the lysosomal targeting

RT signal and active site.";

RL EMBO J. 12:1293-1302(1993). 

RN [11] 

RP X-RAY CRYSTALLOGRAPHY (2.5 ANGSTROMS). 

RC TISSUE=Liver; 

RX MEDLINE=93342076; PubMed=8393577;

RA Baldwin E.T., Bhat T.N., Gulnik S., Hosur M.V., Sowder R.C. II, 

RA Cachau R.E., Collins J., Silva A.M., Erickson J.W.; 

RT "Crystal structures of native and inhibited forms of human cathepsin 

RT D: implications for lysosomal targeting and drug design."; 

RL Proc. Natl. Acad. Sci. U.S.A. 90:6796-6800(1993). 

CC -!- FUNCTION: Acid protease active in intracellular protein breakdown. 
CC Involved in the pathogenesis of several diseases such as breast 

CC cancer and possibly Alzheimer's disease. 

CC -!- CATALYTIC ACTIVITY: Specificity similar to, but narrower than,
CC that of pepsin A. Does not cleave the 4-Gln-|-His-5 bond in B

CC chain of insulin. 

CC -!- SUBUNIT: Consists of a light chain and a heavy chain. 

CC -!- SUBCELLULAR LOCATION: Lysosomal. 

CC -!- POLYMORPHISM: The Val-58 allele is significantly overrepresented

CC in demented patients (11.8%) compared with nondemented controls 

CC (4.9%). Carriers of the Val-58 allele have a 3.1-fold increased 

CC risk for developing AD than noncarriers. 

CC -!- SIMILARITY: Belongs to peptidase family A1.

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its
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DR 
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; M11233; AAB59529.1
DR EMBL; X05344; CAA28955.1
DR EMBL; M63138; AAA51922.1
DR EMBL; M63134; AAA51922.1; JOINED.
DR EMBL; M63135; AAA51922.1; JOINED.
DR EMBL; M63136; AAA51922.1; JOINED.
DR EMBL; M63137; AAA51922.1; JOINED.
DR EMBL; BC016320; AAH16320.1;
DR EMBL; L12980; AAA16314.1
DR EMBL; S74689; AAD14156.1
DR EMBL; S52557; AAD13868.1
DR PIR; A25771; KHHUD.
DR PDB; 1LYA; 31-JAN-94
DR PDB; 1LYB; 31-JAN-94
DR PDB; 1LYW; 22-JUL-99
DR MEROPS; A01.009; -.
DR SWISS-2DPAGE; P07339; HUMAN.
DR Siena-2DPAGE; P07339; -.
DR Genew; HGNC:2529; CTSD.
DR MIM; 116840; -.
DR GO; GO:0004192; F:cathepsin D activity; TAS.
DR InterPro; IPR001969; Aspprotease_AS.
DR InterPro; IPR009007; Pept_A_acid.
DR InterPro; IPR001461; Peptidase_A1.
DR Pfam; PF00026; asp; 1.
DR PRINTS; PR00792; PEPSIN.
DR PROSITE; PS00141; ASP_PROTEASE; 2.


KW Hydrolase; Aspartyl protease; Glycoprotein; Lysosome;
KW Polymorphism; Alzheimer's disease; 3D-structure.


FT SIGNAL 1 18
FT PROPEP 19 64 ACTIVATION PEPTIDE.
FT CHAIN 65 412 CATHEPSIN D.
FT CHAIN 65 161 CATHEPSIN D LIGHT CHAIN (PROBABLE)
FT CHAIN 169 412 CATHEPSIN D HEAVY CHAIN (PROBABLE)
FT ACT_SITE 97 97
FT ACT_SITE 295 295
FT DISULFID 91 160
FT DISULFID 110 117
FT DISULFID 286 290
FT DISULFID 329 366
FT CARBOHYD 134 134 N-LINKED (GLCNAC...).
FT CARBOHYD 263 263 N-LINKED (GLCNAC...).
FT VARIANT 58 58 A -> V (ASSOCIATED WITH
FT AD; POSSIBLY INFLUENCES
FT INTRACELLULAR MATURATION
FT /FTId=VAR_011621.
FT STRAND 67 74
FT TURN 75 77
FT STRAND 78 85
FT TURN 86 89
FT STRAND 90 97
FT TURN 98 99
FT STRAND 103 107
FT TURN 108 109
FT TURN 112 113
FT HELIX 115 118
FT TURN 119 119
FT STRAND 123 123
FT HELIX 125 127
FT TURN 129 130
FT STRAND 132 141
FT STRAND 146 158
FT STRAND 172 184
FT HELIX 188 192
FT STRAND 197 200
FT HELIX 204 206
FT HELIX 208 210
FT HELIX 214 220
FT TURN 221 222
FT STRAND 228 233


Query Match 11.5%; Score 308.5; DB 1; Length 412;

Best Local Similarity 27.1%; Pred. No. 5e-15;

Matches 121; Conservative 75; Mismatches 180; Indels 71; Gaps 22; 

Qy 9 LLPLLAQWLLRAAPELAPAPFTLPLRVAAATNRVVAPTPGPGTPAERHADGLAL 62

I I I I I I I I I I : I I : I : : I | : : : : 

Db 6 LLPLAL--CLLAAP--ASALVRIPLHKFTSIRRTMSEVGGSVEDLIAKGPVSKYSQAVPA 61

Qy 63 ALEPALASPAGAANFLAMVDNLQGDSGRGYYLEMLIGTPPQKLQILVDTGSSNFAVAGTP 122

| : | I : : I I I : I I I I I I : : I I I I I I 

Db 62 VTEGPI--PEVLKNYM DAQYYGEIGIGTPPQCFTVVFDTGSSNLWVPSIH 109

Qy 123 HSYIDT YFDTERSSTYRSKGFDVTVKYTQGSWTGFVGEDLVTIP--KGFNTSFL 174

: I : : : : : | | | | I : I I I : I : : : I I : : I : I I 

Db 110 CKLLDIACWIHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASAL 169 

Qy 175 --VNIATIFESENFFLPGI KWNGILGLAYATLAKPSSSLETFFDSLVTQANI-PN 226

I : I III | : : | | | | : I I : : : : : I I : I : I : I 

Db 170 GGVKVERQVFGEATKQPGITFIAAKFDGILGMAYPRIS--VNNVLPVFDNLMQQKLVDQN 227

Qy 227 VFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQ 286 

: I I I I I I : I I I : I I I : I : : I : I : : : : I : 

Db 228 IFSFY LSRDPDAQPGGELMLGGTDSKYYKGSLSYLNVTRKAYWQVHLDQVEV-AS 281 

Qy 287 SLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARASLIPEFSDGFWTGSQLACWT 346

I I I : I : I I I I : I I : I : I III : I : : I 

Db 282 GLTL-CKE--GCEAIVDTGTSLMVGP VDEVRELQKAIGAVPLIQGEY MIPC-- 329

Qy 347 NSETPWSYFPKISIYLRDENSSRSFRITILPQLYIQPMMGAGLNYECYRF GISPSTN 403

I | I : : I : : : : : | : | : I I I I I : 

Db 330 EKVSTLPAITLKL GGKGYKLS--PEDYTLKVSQAGKTLCLSGFMGMDIPPPSG 380

Qy 404 AL-VIGATVMEGFYVIFDRAQKRVGFA 429

I :: I : : I : I I I I I I I I 
Db 381 PLWILGDVFIGRYYTVFDRDNNRVGFA 407 



RESULT 15 



CATD_MOUSE 

ID CATD_MOUSE STANDARD; PRT; 410 AA. 

AC P18242; 

DT 01-NOV-1990 (Rel. 16, Created) 

DT 01-NOV-1990 (Rel. 16, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Cathepsin D precursor (EC 3.4.23.5). 

GN CTSD. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain; 

RX MEDLINE=91088345; PubMed=2263503;

RA Diedrich J.F., Staskus K.A., Retzel E.F., Haase A.T.; 

RT "Nucleotide sequence of a cDNA encoding mouse cathepsin D."; 

RL Nucleic Acids Res. 18:7184-7184(1990). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=90326544; PubMed=2374732;

RA Grusby M.J., Mitchell S.C., Glimcher L.H.;

RT "Molecular cloning of mouse cathepsin D.";

RL Nucleic Acids Res. 18:4008-4008(1990). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RX MEDLINE=94280622; PubMed=8011168;

RA Hetman M., Perschl A., Saftig P., von Figura K., Peters C.;

RT "Mouse cathepsin D gene: molecular organization, characterization of

RT the promoter, and chromosomal localization.";

RL DNA Cell Biol. 13:419-427(1994).

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Brain, and Mammary gland; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 



CC -!- FUNCTION: Acid protease active in intracellular protein breakdown.
CC -!- CATALYTIC ACTIVITY: Specificity similar to, but narrower than,
CC that of pepsin A. Does not cleave the 4-Gln-|-His-5 bond in B
CC chain of insulin.
CC -!- SUBUNIT: Consists of a light chain and a heavy chain.
CC -!- SUBCELLULAR LOCATION: Lysosomal.
CC -!- SIMILARITY: Belongs to peptidase family A1.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
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DR MEROPS; A01.009; -. 

DR MGD; MGI:88562; Ctsd.

DR InterPro; IPR001969; Aspprotease_AS.

DR InterPro; IPR009007; Pept_A_acid.

DR InterPro; IPR001461; Peptidase_A1.

DR Pfam; PF00026; asp; 1. 

DR PRINTS; PR00792; PEPSIN. 
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410 


CATHEPSIN D. 




FT ACT_SITE 97 97 BY SIMILARITY.
FT ACT_SITE 293 293 BY SIMILARITY.
FT DISULFID 91 160 BY SIMILARITY.
FT DISULFID 110 117 BY SIMILARITY.
FT DISULFID 284 288 BY SIMILARITY.
FT DISULFID 327 364 BY SIMILARITY.
FT CARBOHYD 134 134 N-LINKED (GLCNAC...) (BY SIMILARITY).
FT CARBOHYD 261 261 N-LINKED (GLCNAC...) (BY SIMILARITY).


SQ SEQUENCE 410 AA; 44954 MW; DC4928EC46928BF0 CRC64;

Query Match 11.4%; Score 306.5; DB 1; Length 410;

Best Local Similarity 27.5%; Pred. No. 6.9e-15;

Matches 103; Conservative 64; Mismatches 123; Indels 85; Gaps 15;



Qy 92 YYLEMLIGTPPQKLQILVDTGSSNFAVAGTPHSYIDT YFDTERSSTYRSKGFDV 145

I I :: I I I I I I :: I I I I I I I : I : : : : : I I I I I

Db 79 YYGDIGIGTPPQCFTVVFDTGSSNLWVPSIHCKILDIACWVHHKYNSDKSSTYVKNGTSF 138



Qy 146 TVKYTQGSWTGFVGEDLVTIPKGFNTSFLVNIAT IFESENFFLPGI KWNGIL 197 

: | | | : | : : : | | : : I : | I I I I III I : : I M 

Db 139 DIHYGSGSLSGYLSQDTVSVPCKSDQSKARGIKVEKQIF-GEATKQPGIVFVAAKFDGIL 197 

Qy 198 GLAYATLAKPSSSLETFFDSLVTQANI-PNVFSMQMCGAGLPVAGSGTNGGSLVLGGIEP 256 

I : I : : : : : I I : I : I : I : I I • I I I I I : I I I : 

Db 198 GMGYPHIS--VNNVLPVFDNLMQQKLVDKNIFSFY LNRDPEGQPGGELMLGGTDS 250

Qy 257 SLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVF 316

I | : : | : : I : I : : : M : I : I I I : : I I I I : I I : I I I : 

Db 251 KYYHGELSYLNVTRKAYWQVHMDQLEVGNE-LTL-CK--GGCEAIVDTGTSLLVGPVEEV 306

Qy 317 DAVVEAVARASLIPEFSDGFWTGSQLACWTNSETPWSYFPKISIYLRDENSSRSFRITIL 376

: : I : I I : I : : : I 

Db 307 KELQKAIGAVPLI QGEYMIPCEKVSSL 333 

Qy 377 PQLYIQPMMGAGLNYEC YRFGIS P S TNALVI GAT VMEG 414 

I : I : : : I I I I I I : I I I : : I : 

Db 334 PTVYLK--LG-GKNYELHPDKYILKVSQGGKTICLSGFMGMDIPPPSGPLWILGDVFIGS 390

Qy 415 FYVIFDRAQKRVGFA 429

: I : I I I I I II I 

Db 391 YYTVFDRDNNRVGFA 405 



Search completed: March 4, 2004, 15:36:24 
Job time : 18.5319 secs



